OCR

Digital Humanist aims to run OCR over a terabyte of rare book scans

Adam Anderson, Mellon Postdoctoral Fellow in the Digital Humanities

Since his college days at Brigham Young University (BYU), Adam Anderson has been measuring evenings and weekends in pages rather than hours. “You can scan about 400 pages an hour, once you get in the groove,” he explains. Anderson, a Mellon Postdoctoral Fellow in the Digital Humanities at UC Berkeley, has spent his career scanning texts in order to draw upon secondary literature in archaeology and computational linguistics.

Go from Analog to Digital Texts with OCR

An early modern text (English)

A collection of digitized texts marks the start of a research project. Or does it?

For many social sciences and humanities researchers, creating searchable, editable, and machine-readable digital texts out of heaps of paper in archival boxes or from books painstakingly sourced from overlooked corners of the library can be a tedious, time-consuming process.