Preservation and Access: Research and Development

Period of Performance

1/1/2015 - 9/30/2017

Funding Totals (outright + matching)

$264,700.00 (approved)
$264,699.99 (awarded)

Universal Scripts Project

FAIN: PR-50205-15

University of California, Berkeley (Berkeley, CA 94704-5940)
Deborah Winthrop Anderson (Project Director: May 2014 to January 2018)

The preparation of twelve scripts—seven historical and five modern—for inclusion in the international Unicode standard, to aid research using materials in historical scripts and promote communication in minority language communities.

Although computer and mobile users in many parts of the world can now communicate in hundreds of languages by using their own native writing system, there are still linguistic minority groups, and users of historical writing systems, who cannot. This is because the letters and symbols of these scripts are not yet part of the international character encoding standard, known as Unicode. More than one hundred eligible scripts are not yet included in Unicode, which directly affects humanities research, the creation of the global digital repository of humankind's literary and cultural heritage and, for users of modern scripts, basic communication. This project will fund proposals for five modern and seven historical scripts for inclusion in the standard, thereby preserving text materials in these scripts and paving the way for electronic communication in (and about) scripts by scholars and the user communities at large.

Media Coverage

The Alphabet That Will Save a People From Disappearing (Media Coverage)
Author(s): Kaveh Waddell
Publication: The Atlantic
Date: 11/16/2016
Abstract: An article about the Adlam script that mentions the NEH-sponsored Script Encoding Initiative (Universal Scripts Project), which supported the encoding of the script in the Unicode Standard.

Unicode: A story of corruption, connection, and smiling poo: The people who made the internet open to anyone (Media Coverage)
Author(s): Maggie Shafer
Publication: bulb
Date: 9/18/2015
Abstract: An article about Unicode that mentions the Script Encoding Initiative (Universal Scripts Project) and its work to get scripts into the Unicode Standard.

Associated Products

Unlocking the Mayan Script with Unicode (Conference Paper/Presentation)
Title: Unlocking the Mayan Script with Unicode
Author: Carlos Pallán Gayol
Author: Deborah Anderson
Abstract: The visual complexity of the Maya hieroglyphic script has made it difficult to apply standard script-encoding approaches. A multidisciplinary collaboration established between UC Berkeley's Script Encoding Initiative and the University of Bonn's MAAYA Project aims to employ new methods combining linguistics, Maya epigraphy, digital palaeography and computer vision to overcome some of the major challenges preventing the encoding of Maya hieroglyphs in the Unicode Standard. Encoding the Maya hieroglyphs in Unicode would allow creation of vast open-access Maya hieroglyphic text repositories and libraries, where advanced search and query functionalities and text-mining could be applied. As a result, the ability to render any Maya hieroglyphic text in Unicode could impact the overall accessibility, reproduction, visualization and long-term preservation of the sum of ancient knowledge recorded by the Maya scribes on thousands of texts and inscriptions produced between ca. 250 BC and 1450 AD in Central America.
Date: 07/13/2016
Primary URL:
Primary URL Description: Conference website
Secondary URL:
Secondary URL Description: Full abstract
Conference Name: Digital Humanities 2016

Negotiating the issues of encoding and producing traditional scripts on computers: Working with Unicode (Conference Paper/Presentation)
Title: Negotiating the issues of encoding and producing traditional scripts on computers: Working with Unicode
Author: Deborah Anderson
Author: Stephen Morey
Abstract: Over the past 30 years, developments in computing mean that almost every script and writing system ever created can be coded on a computer, used on Facebook, mobile phones and in emails, and large numbers of documents can be encoded, searched and archived in a range of different scripts. In South and Southeast Asia, there are a large number of different scripts, some used by quite small communities. Since the earlier part of this century, a great effort has been made to encode all of these scripts in Unicode, a standard that allows for the encoding of any symbols used in writing that can be demonstrated to be in use, or to have been in use in the past. However, negotiating a script into Unicode is a complex issue, involving considerable technical expertise and knowledge of script encoding principles, things that are difficult enough for an academic linguist but virtually impenetrable for members of the speech communities. Combining our expertise in both script encoding and in linguistics, we will raise issues of community involvement in the process by means of several case studies.
Date: 07/01/2015
Primary URL:
Primary URL Description: Slides of talk