HK-50176-14
Digital Humanities: Digital Humanities Implementation Grants
Board of Trustees of the University of Illinois
Exploring the Billions and Billions of Words in the HathiTrust Corpus with Bookworm: HathiTrust + Bookworm Project
9/1/2014 - 8/31/2017
$324,841.00
J. Stephen Downie
Erez Lieberman-Aiden
Champaign, IL 61801-3620, USA
2014
Interdisciplinary Studies, General

The enhancement and integration of the Bookworm analytical tool with the HathiTrust Digital Library, which holds 3.9 billion pages of digitized materials. Scholars would be able to build individual collections of materials to be studied and to discover new textual use patterns across the corpus.

The HathiTrust + Bookworm (HT+BW) Project provides scholars new ways to explore trends within the massive HathiTrust corpus. Detailed exploration of metadata facets adds analytic value over such tools as Google Ngram Viewer. It enables scholars to explore personal worksets and aids discovery of new works. It will help the HathiTrust Research Center provide computational access to the HathiTrust corpus. Open-source improvements to Bookworm code will increase value to other large text projects.