Program

Digital Humanities: Digital Humanities Advancement Grants

Period of Performance

9/1/2019 - 12/30/2022

Funding Totals

$324,733.00 (approved)
$324,733.00 (awarded)


Datascribe: Enabling Structured Data Transcription in the Omeka S Web Platform

FAIN: HAA-266444-19

George Mason University (Fairfax, VA 22030-4444)
Jessica Otis (Project Director: January 2019 to present)
Lincoln A. Mullen (Co Project Director: May 2019 to present)

The creation of a structured data transcription module for the Omeka S platform that will make it easier for scholars working with quantitative sources (such as government forms or institutional records) to transcribe them into structured data that can be analyzed or visualized.

Datascribe is an application for a Level III Digital Humanities Advancement Grant to create a structured data transcription module, or plug-in, for the Omeka S platform for digital collections. Scholars often collect sources, such as government forms or institutional records, intending to transcribe them into datasets which can be analyzed or visualized. Existing software enables transcription into free-form text but not into tables of data. The proposed module will enable scholars to identify the structure of the data within their sources, speed up the transcription of their sources, and reliably structure their transcriptions in a form amenable to computational analysis. Scholars will be able to turn sources into tables of data stored as numbers, dates, or categories. This module will build on the Omeka S platform, enabling scholars to display transcriptions alongside the source images and metadata, to crowdsource transcriptions, and to publish their results on the web.
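
As an illustration of what transcribing into "tables of data stored as numbers, dates, or categories" can mean in practice, here is a minimal Python sketch. It is not DataScribe's code or API (the module itself runs inside Omeka S); the field names, the category list, and the helper functions are all hypothetical, chosen only to show how free-form transcribed text might be checked and converted into typed values.

```python
from datetime import date, datetime
from typing import Callable, Dict, List, Union

Value = Union[int, date, str]


def parse_number(text: str) -> int:
    """Convert transcribed text such as ' 1,204 ' into an integer."""
    return int(text.strip().replace(",", ""))


def parse_date(text: str) -> date:
    """Convert an ISO-style date string (YYYY-MM-DD) into a date."""
    return datetime.strptime(text.strip(), "%Y-%m-%d").date()


def make_category(allowed: List[str]) -> Callable[[str], str]:
    """Build a parser that accepts only values from a fixed list."""
    def parse(text: str) -> str:
        value = text.strip()
        if value not in allowed:
            raise ValueError(f"{value!r} is not one of {allowed}")
        return value
    return parse


# Hypothetical structure for one kind of institutional record.
SCHEMA: Dict[str, Callable[[str], Value]] = {
    "report_date": parse_date,
    "department": make_category(["Health", "Education"]),
    "cases": parse_number,
}


def transcribe_row(raw: Dict[str, str]) -> Dict[str, Value]:
    """Apply each field's parser to the raw transcription of one record."""
    return {name: parse(raw[name]) for name, parse in SCHEMA.items()}


row = transcribe_row(
    {"report_date": "1921-03-05", "department": "Health", "cases": " 1,204 "}
)
print(row)
```

Declaring the structure once and validating every row against it is what makes the resulting table usable for analysis without a separate clean-up pass; a value outside the allowed categories is rejected at transcription time rather than discovered later.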





Associated Products

DataScribe (Computer Program)
Title: DataScribe
Author: Ken Albers
Author: Megan Brett
Author: Lincoln Mullen
Author: Kim Nguyen
Author: Jessica Otis
Author: Jim Safley
Abstract: Scholars often collect sources, such as government forms or institutional records, intending to transcribe them into datasets which can be analyzed or visualized. This module enables scholars to identify the structure of the data within their sources, speed up the transcription of their sources, and reliably structure their transcriptions in a form amenable to computational analysis. Scholars can turn sources into tables of data stored as numbers, dates, categories, and more. Because this module builds on the Omeka S platform, it allows scholars to display transcriptions alongside the source images and metadata and to publish their results on the web.
Year: 2022
Primary URL: http://datascribe.tech
Secondary URL: http://omeka.org
Access Model: open source
Source Available?: Yes

Documentation (Web Resource)
Title: Documentation
Author: Jessica Otis
Author: Lincoln Mullen
Author: Greta Swain
Author: Megan Brett
Author: Daniel Howlett
Author: Emily Meyers
Author: Hernán Adasme
Abstract: Case studies, tutorials, and other documentation about the project
Year: 2022
Primary URL: http://datascribe.tech

DataScribe: An Omeka S module for structured data transcription (Article)
Title: DataScribe: An Omeka S module for structured data transcription
Author: Jessica M. Otis
Author: James Safley
Author: Megan Brett
Author: Lincoln Mullen
Abstract: Article on the DataScribe structured data transcription module published in The Journal of Open Source Software
Year: 2024
Primary URL: https://joss.theoj.org/papers/10.21105/joss.05661
Primary URL Description: Direct link to journal article
Access Model: open access
Format: Journal
Periodical Title: The Journal of Open Source Software