Bridging Millennia: Machine Learning's Impact on Assyriology

You can also join remotely.

The field of Assyriology has recently undergone a quiet transformation with the integration of machine learning techniques. In this presentation, we will explore how machine learning is being used to study artifacts from Mesopotamia that were inscribed in the cuneiform script. Cuneiform, which was employed for thousands of years to record a handful of languages in the Middle East, presents both opportunities and challenges for computer-aided processing and analysis. From natural language processing applications, including machine translation, to computer vision, researchers have found clever ways to use machine learning as a tool to understand both the material artifacts themselves and the texts inscribed on them. The overarching challenge in these applications has been data sparsity, in particular the lack of training data. After discussing how these challenges have been addressed and the current state of the art, we will end with a few words about what to expect from future technologies for the study of these artifacts.

Émilie Pagé-Perron is a Junior Research Fellow of Assyriology at Wolfson College. She is an Assyriologist and Digital Scholar. Her research interests encompass Mesopotamian social history, Sumerian philology, and the computational linguistics of cuneiform languages. Émilie employs both traditional philological approaches and computational methods in her work. Although her education is based in the Humanities, she has honed her scientific skills, focusing on data management and curation and on natural language processing. Émilie coordinated the Machine Translation and Automated Analysis of Cuneiform Languages project (2017-2020), a UCLA-Toronto-Frankfurt project funded by the NEH, SSHRC and the DFG through the T-AP Digging into Data Challenge. She also managed the CDLI Framework Update project (2017-2020, NEH). She continues to work on ways to make ancient cuneiform texts more accessible to varied audiences.