Program

Preservation and Access: JISC/NEH Transatlantic Digitization Collaboration Grants

Period of Performance

4/1/2008 - 3/31/2009

Funding Totals

$106,395.00 (approved)
$106,395.00 (awarded)


The World Wide Web of Humanities

FAIN: PX-50016-08

Internet Archive (San Francisco, CA 94129-1711)
Kristin Carpenter Negulescu (Project Director: November 2007 to August 2009)

Development of tools and methodologies for indexing and analyzing the textual parts of larger digital collections, more focused browsing ("crawling") of the Web, and unified access to data resources.

The Internet Archive (IA) houses one of the largest publicly accessible collections of digital artifacts in the world, with more than 110 billion Web captures that include content from more than 65 million Web sites in over 40 languages. In addition, IA maintains a collection of more than one million public domain texts, thousands of still and moving images, and numerous multimedia software titles. IA plays an active role in the development of digital archiving standards and open-source tools and participates in the Open Content Alliance and the International Internet Preservation Consortium. The English partners in the project are the Oxford Internet Institute, which specializes in researching the effects of the Web and technology on scholarship and teaching, and Hanzo, a company focused on advanced Web archiving technologies. The proposed project would use advanced hyperlink analysis and data mining to study how research in the digital humanities has been framed, funded, and implemented internationally over time. The resulting data would be archived in a specialized collection of several million Uniform Resource Identifiers that would be made available for future research and analysis. The collection would include a range of content such as digital humanities project Web sites and tools, portals, datasets, and research reports. In addition to the data archive outlining the development and current state of the humanities on the Web, the applicant would index the collection for full-text search, encode the indexed data, and make available an interface and tools for future research. All these resources would be freely available on the Web.





Associated Products

World Wide Web of Humanities WWI and WWII web site (Web Resource)
Title: World Wide Web of Humanities WWI and WWII web site
Author: Hanzo Archives
Author: Oxford Internet Institute
Author: Internet Archive
Abstract: This one-year project sought to close the gap between the lack of supporting infrastructure available to humanities scholars and the desire of researchers to access and study resources digitized and/or published to the Web. The project sought to establish a possible framework for e-Humanities research using available open source tools and technologies and archived web content. Sample collections of materials relating to World Wars I & II were compiled to help illustrate researcher needs and requirements and in anticipation of further development of tools for working with very large volumes of data housed by digital archives around the globe. The WWWoH project was a collaboration between Internet Archive (IA), the Oxford Internet Institute (OII), and Hanzo Archives. OII and the Australian National University contributed expertise in web research and led the curation of these collections for application and use by humanities scholars. Both IA and Hanzo Archives applied extensive experience in web archiving and in the creation of web collections to extract, assemble, and make accessible archival data germane to this project.
Year: 2009
Primary URL: http://wwwoh-access.archive.org/wwwoh/
Primary URL Description: This is the traditional browse-and-search interface to each collection, assembled for the project.