Award Number: HJ-50173-14
Grant Program: Digital Humanities: Digging into Data
Award Recipient: President and Fellows of Harvard College
Project Title: Automating Data Extraction from Chinese Texts
Award Period: 2/1/2014 - 1/31/2017
Approved Award Total: $125,000.00
Project Director: Peter K. Bol
Recipient Location: Cambridge, MA 02138-3800, USA
Year: 2013
Discipline: Interdisciplinary Studies, Other
Program: Digging into Data
Division: Digital Humanities

The project will develop the Automating Data Extraction from Chinese Texts platform, which allows scholars to transform texts written in classical Chinese into highly structured data suitable for text mining. The project is led by humanities scholars and computer scientists from Harvard University (US) and King's College London (UK), with additional expertise provided by scholars from National Taiwan University and Academia Sinica, Taiwan. The UK partner is requesting £125,000 from the UK funding consortium.

The Automating Data Extraction from Chinese Texts Project aims to provide humanists and social scientists with a means of transforming 2,200 years of Chinese texts into structured data. The project will fully develop an open-source platform that allows its users to apply sophisticated text-mining techniques, hitherto the domain of information scientists, to a wide variety of historical and literary texts. Users interested in biographical data, for example, will be able to tag and extract personal names, dates, place names, official titles and postings, kinship ties, and other social relationships. The platform will be tested against 2,000 local histories spanning an 800-year period, as well as 19,000 letters and 500 notebooks dating from the seventh through the thirteenth centuries. Data extracted from the sample repositories will be used to enrich text-mining applications and will also be made available in English and Chinese for research through open-access online databases and data archives.
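
As a rough illustration of the kind of tagging and extraction described above, the following Python sketch marks personal names, place names, and reign-period dates in a short classical Chinese passage. It is not the project's platform or method: the word lists, the date pattern, the `tag_entities` function, and the sample sentence are all assumptions made up for this example, and a real system would need far larger gazetteers and much more careful handling of ambiguity.

```python
import re

# Hypothetical mini-gazetteers, invented for this sketch only.
PERSON_NAMES = ["王安石"]
PLACE_NAMES = ["臨川"]
REIGN_PERIODS = ["慶曆", "熙寧"]

# A reign-period date is assumed to be: reign name + Chinese numerals + 年.
NUMERALS = "元一二三四五六七八九十"
DATE_PATTERN = re.compile(
    "(?:" + "|".join(REIGN_PERIODS) + ")[" + NUMERALS + "]+年"
)


def tag_entities(text):
    """Return (start, end, label, surface) tuples sorted by position."""
    entities = []
    # Dictionary lookup for names and places.
    for label, terms in (("PERSON", PERSON_NAMES), ("PLACE", PLACE_NAMES)):
        for term in terms:
            for match in re.finditer(re.escape(term), text):
                entities.append((match.start(), match.end(), label, match.group()))
    # Pattern matching for dates.
    for match in DATE_PATTERN.finditer(text):
        entities.append((match.start(), match.end(), "DATE", match.group()))
    return sorted(entities)


# Illustrative sample sentence, not taken from the project's corpus.
sample = "王安石，字介甫，臨川人。慶曆二年進士。"
for start, end, label, surface in tag_entities(sample):
    print(start, end, label, surface)
```

Each tuple records where a span sits in the text along with its category, which is one simple way to turn running prose into rows that could later be loaded into a database.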