Enhanced Access to Digital Humanities Monographs
FAIN: HD-50065-07
Syracuse University (Syracuse, NY 13244-0001)
Anne Roel Diekema (Project Director: November 2006 to April 2008)
Creation of a proof-of-concept system that employs Natural Language Processing techniques and utilizes information contained in tables of contents and back-of-the-book indexes for more precise searching of the content of electronic books.
Research shows that monographs are a key source of information for researchers in the humanities. Unfortunately, modern-day search technology is not well suited to monograph access, because most full-text retrieval systems were developed for the search and retrieval of web pages or journal articles, which tend to be far shorter than the average book. We propose to apply Natural Language Processing techniques to bring the rich, intellectually viable information contained in tables of contents and back-of-the-book indexes into traditional information retrieval and browsing systems, making monographs more accessible by capitalizing on the internal structure of the book. We believe this automation effort will ease the task of making the content of electronic books more precisely accessible, ultimately allowing humanities scholars to carry out their research even as their preferred resources become digitized.
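As a rough illustration of the idea, and not the project's actual system, the following minimal Python sketch shows how back-of-the-book index entries might be turned into a page-level lookup. The entry format ("heading, page, page-page") and the function names parse_index and find_pages are assumptions made for this example only.

```python
import re
from collections import defaultdict


def parse_index(lines):
    """Map each index heading (lowercased) to the set of pages it cites.

    Assumes entries look like "Heading, 12, 45-47"; lines without page
    locators (for example "see also" cross-references) are skipped.
    """
    index = defaultdict(set)
    for line in lines:
        # Heading is everything before the trailing run of page locators.
        match = re.match(r"^(.*?),\s*(\d[\d,\s\-]*)$", line.strip())
        if not match:
            continue
        heading = match.group(1).strip().lower()
        # Each locator is a single page or an inclusive page range.
        for start, end in re.findall(r"(\d+)(?:-(\d+))?", match.group(2)):
            first = int(start)
            last = int(end) if end else first
            index[heading].update(range(first, last + 1))
    return index


def find_pages(index, term):
    """Return sorted pages of every heading that contains the term as a word."""
    term = term.lower()
    pages = set()
    for heading, cited in index.items():
        if term in re.findall(r"\w+", heading):
            pages.update(cited)
    return sorted(pages)


# Hypothetical index entries, invented for illustration.
sample_entries = [
    "Romanticism, 12, 45-47",
    "Byron, Lord, 30",
    "see also poetry",
]
sample_index = parse_index(sample_entries)
```

A retrieval system could use such a lookup to point a query directly at the pages an indexer judged relevant, rather than ranking the full text of the book as one long document.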