Program

Digital Humanities: Digital Humanities Advancement Grants

Period of Performance

9/1/2017 - 8/31/2021

Funding Totals

$325,000.00 (approved)
$301,941.98 (awarded)


Text in Situ: Reasoning about Visual Information in the Computational Analysis of Books

FAIN: HAA-256044-17

Carnegie Mellon University (Pittsburgh, PA 15213-3815)
Taylor Berg-Kirkpatrick (Project Director: January 2017 to March 2025)
David Bamman (Co-Project Director: May 2017 to March 2025)

Participating institutions:
Carnegie Mellon University (Pittsburgh, PA) - Applicant/Recipient
Regents of the University of California, Berkeley (Berkeley, CA) - Participating Institution

Implementation of three studies and creation of software tools that computationally analyze visual information about printed books. Partners include the Folger Shakespeare Library and the HathiTrust Research Center.

While humanistic inquiry traditionally involves synthesizing a rich set of contextual information, computational approaches to text analysis introduce several forms of simplification, beginning from the initial act of digitization. In this work, we advocate for an alternative that seeks to reason about text within a rich material context: as ink on paper. We propose new computational approaches to three tasks: using visual information about the physical layout of pages to segment the document structure of books in the HathiTrust; reconstructing lacunae (physical gaps in the medium of writing); and attributing and identifying compositors from visual cues in typesetting (using Shakespeare’s First Folio). Our core unifying principle is reasoning about text holistically: awareness of a text’s rich material context can not only shape the historical questions we ask of large-scale book corpora, but can also be informative for traditional tasks that text alone has been used to answer.