Program

Digital Humanities: Digital Humanities Advancement Grants

Period of Performance

10/1/2017 - 3/31/2020

Funding Totals

$75,000.00 (approved)
$75,000.00 (awarded)


Visualizing Webpage Changes Over Time

FAIN: HAA-256368-17

Old Dominion University Research Foundation (Norfolk, VA 23508-0369)
Michele C. Weigle (Project Director: January 2017 to February 2022)
Deborah Kempe (Co Project Director: July 2017 to February 2022)
Pamela Graham (Co Project Director: July 2017 to February 2022)
Alexander Thurman (Co Project Director: July 2017 to February 2022)
Michael L Nelson (Co Project Director: July 2017 to February 2022)

Participating institutions:
Old Dominion University Research Foundation (Norfolk, VA) - Applicant/Recipient
New York Art Resources Consortium (New York, NY) - Participating Institution
Trustees of Columbia University in the City of New York (New York, NY) - Participating Institution

The development of prototypes for a set of open-source visualization tools to ease navigation of web archive collections. Partners include the New York Art Resources Consortium and Columbia University Libraries.

As web archives grow in importance and size, techniques for understanding how a web page changes through time need to adapt from an assumption of scarcity (just a few copies of a page, no more than a few weeks or months apart) to one of abundance (tens of thousands of copies of a page, spanning as much as 20 years). Old Dominion University, New York Art Resources Consortium (NYARC), and Columbia University Libraries (CUL) will jointly research and develop tools for efficient visualization of and interaction with archived web pages. We will develop 1) a tool for visualizing web page changes in arbitrary web archives, 2) a plug-in for the popular Wayback Machine web archiving system (for better support of the functionality otherwise available via #1), and 3) scripts for easy embedding of the visualizations in live web pages, providing tighter integration of the archived web and live web. This work will be informed by, and in support of, CUL's and NYARC's existing web archiving activities.





Associated Products

2018-03-12: NEH ODH Project Directors' Meeting (Blog Post)
Title: 2018-03-12: NEH ODH Project Directors' Meeting
Author: Michele Weigle
Abstract: This blog post is a trip report from the NEH ODH Project Directors' Meeting in February 2018. It includes slides presented as part of our lightning talk on the "Visualizing Webpage Changes Over Time" project.
Date: 03/12/2018
Primary URL: http://ws-dl.blogspot.com/2018/03/2018-03-12-neh-odh-project-directors.html
Blog Title: WS-DL Blog

How I Changed Over Time: A webservice to summarize TimeMaps based on SimHashed HTML content (Report)
Title: How I Changed Over Time: A webservice to summarize TimeMaps based on SimHashed HTML content
Author: Maheedhar Gunnam
Abstract: As the web becomes more dynamic, the content of a web page often grows, changes, and sometimes shrinks. Because these pages are archived many times, the archived copies serve as a digital history of changes that are long gone from the live page. But viewing these numerous archived copies, or mementos, to perceive the major changes over time is nearly impossible, as the memento count can be very high. In the case of cnn.com, the web page has been archived 188,966 times. This TimeMap summarization tool, referenced throughout this paper as ‘tmvis’, facilitates visualization of these changes by analyzing all mementos in a TimeMap and picking the most unique mementos, which best describe the major changes in a webpage. A web service with a user-friendly interface and command-line tools are also provided for this tool.
Date: 05/04/2018
Primary URL: http://www.cs.odu.edu/~mweigle/papers/gunnam-ms-proj-18.pdf
Primary URL Description: Master's project report, Old Dominion University, May 2018
Access Model: open access
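
The abstract above describes picking the "most unique" mementos from a TimeMap using SimHashed HTML content. The following is a minimal sketch of that general idea, not the tmvis implementation: the function names (`simhash`, `hamming`, `summarize`), the word-level tokenization, the MD5-based token hash, and the default threshold of 8 bits are all assumptions made for illustration. It keeps a memento only when its fingerprint differs from the last kept memento by more than the threshold.

```python
import hashlib
import re


def simhash(text: str, bits: int = 64) -> int:
    """Compute a SimHash fingerprint over the word tokens of `text`."""
    weights = [0] * bits
    for token in re.findall(r"\w+", text.lower()):
        digest = hashlib.md5(token.encode("utf-8")).digest()
        h = int.from_bytes(digest[: bits // 8], "big")
        # Each token votes +1 or -1 on every bit position.
        for i in range(bits):
            weights[i] += 1 if (h >> i) & 1 else -1
    fingerprint = 0
    for i, w in enumerate(weights):
        if w > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def summarize(mementos, threshold: int = 8):
    """Select representative mementos (hypothetical helper).

    `mementos` is a chronologically ordered list of (timestamp, html_text)
    pairs. A memento is kept when its fingerprint differs from the last
    kept fingerprint by more than `threshold` bits.
    """
    kept = []
    last = None
    for timestamp, html in mementos:
        fp = simhash(html)
        if last is None or hamming(fp, last) > threshold:
            kept.append(timestamp)
            last = fp
    return kept
```

Comparing each memento against the last kept one, rather than against its immediate predecessor, means a slow drift of small edits eventually registers as a change. The threshold value would need tuning against real archived pages.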

Visualizing Webpage Changes Over Time With TMVis (Blog Post)
Title: Visualizing Webpage Changes Over Time With TMVis
Author: Dhruv Patel
Author: Abigail Mabe
Abstract: This blog post is a description of the project and contains a system walkthrough video and links to the source code.
Date: 05/21/2020
Primary URL: https://ws-dl.blogspot.com/2020/05/2020-05-21-visualizing-webpage-changes.html
Blog Title: WS-DL Blog

TMVis source code (Computer Program)
Title: TMVis source code
Author: Abigail Mabe
Author: Dhruv Patel
Author: Maheedhar Gunnam
Author: Surbhi Shankar
Author: Mat Kelly
Author: Sawood Alam
Abstract: Source code for the TMVis project
Year: 2020
Primary URL: https://github.com/oduwsdl/tmvis
Primary URL Description: Source Code, hosted on GitHub
Secondary URL: http://tmvis.cs.odu.edu/
Secondary URL Description: Demo server
Access Model: open-source
Programming Language/Platform: Node, JavaScript
Source Available?: Yes

TMVis screencast demo video (Film/TV/Video Broadcast or Recording)
Title: TMVis screencast demo video
Writer: Dhruv Patel
Writer: Abigail Mabe
Abstract: Demo video of the TMVis project
Year: 2020
Primary URL: https://www.youtube.com/watch?v=iAx9DUC1yus
Primary URL Description: YouTube video
Access Model: open access via YouTube
Format: Web

Visualizing Webpage Changes Over Time (Report)
Title: Visualizing Webpage Changes Over Time
Author: Michele C. Weigle
Author: Abigail Mabe
Author: Dhruv Patel
Author: Maheedhar Gunnam
Author: Surbhi Shankar
Author: Mat Kelly
Author: Sawood Alam
Author: Michael L. Nelson
Abstract: We report on the development of TMVis, a web service to provide visualizations of how individual webpages have changed over time. We leverage past research on summarizing collections of webpages with thumbnail-sized screenshots and on choosing a small number of representative past archived webpages from a large collection. We offer four visualizations: image grid, image slider, timeline, and animated GIF. Embed codes for the image grid and image slider can be produced to include these on separate webpages. The animated GIF can be downloaded as an image file for the same purpose. This tool can be used to allow scholars from various disciplines, as well as the general public, to explore the temporal nature of web archives. We hope that these visualizations will just be the beginning and will provide a starting point for others to expand these types of offerings for users of web archives.
Date: 06/02/2020
Primary URL: https://arxiv.org/abs/2006.02487
Primary URL Description: ArXiv tech report
Access Model: open access