At archaeological sites around the world, amid ancient artifacts and the remains of once-thriving societies, clay tablets lie in the ground bearing messages that have waited more than 5,000 years to be read. These messages, written by pressing a reed stylus into clay to form the wedge-shaped marks of a script known as cuneiform, are typically thought of as the earliest examples of written language in human history.
About 600,000 of these cryptic tablets have been unearthed over the past few centuries, and they've been steadily piling up in university and museum collections all over the world. From these findings, researchers known as Assyriologists have been able to decode, translate, and breathe new life into extinct languages such as Akkadian, once spoken by people living in the world's first civilizations in Mesopotamia.
However, only so many Assyriologists are capable of translating these ancient texts. As new clay tablets are discovered each year, the task of rendering these ancient messages in modern languages becomes more and more daunting.
"There are very few people who can really read it well," Shai Gordin, an Assyriologist at Ariel University, told The Daily Beast. "It's also 3,000 years of written history, right? So if someone can read one period of cuneiform really well, that doesn't necessarily mean they can read the other periods really well."
So, to help translate the growing number of ancient texts into modern languages, Gordin and his colleagues turned to an emerging technology: artificial intelligence. More specifically, the team developed a neural network capable of translating Akkadian and Sumerian cuneiform into English. The researchers published their findings May 2 in the journal PNAS Nexus.
Not only can a model like this greatly speed up translation for researchers, it can also give historians and Assyriologists an opportunity to gain new, more in-depth insights into these ancient civilizations. The authors also see potential to democratize Assyriology by giving laypeople access to such tools.
"We really wanted to have [Assyriology] be more integrated with technology," Gordin said. "On the one hand, it makes the process more seamless and standardized. On the other, it's actually helping us ask new and exciting questions because it allows us to see patterns we haven't seen before."
For better or worse, AI can be seen as the next evolution of humanity's relationship with language. Not only can AI power tools like Google Translate to help people communicate and understand one another no matter what corner of the world they were born in, but large language models (LLMs) like OpenAI's ChatGPT and Google's Bard are beginning to fundamentally change the way some people write and engage with the content they see online.
It's no real surprise, then, that AI would eventually come for some of the oldest languages known to humanity. Willis Monroe, an ancient Near East historian at the University of British Columbia who wasn't involved in the study, believes it's fitting in an almost poetic sense that Gordin and his colleagues developed an AI model to engage with one of the earliest written languages in human history.
"Cuneiform is very, very old," Monroe told The Daily Beast. "It's one of the earliest if not the earliest writing scripts in the world. So something that's so fun about this is the combination of modern, digital approaches like neural networks and machine learning with the first writing ever. It's kind of bookending human written history, so it's really exciting work."
The model itself is an extension of the Babylonian Engine, a platform for digital Assyriology that seeks to merge emerging tech with the study of these early written languages. "Its goal is to integrate artificial intelligence and machine learning models into the actual work that we do as historians and scholars of the past," Gordin said.
This latest study is a big step toward that goal, albeit a formidable one considering the limitations of cuneiform. For one, the script is akin to "three-dimensional handwriting," Monroe said. "It's like trying to have a computer automatically generate a translation of someone writing in French on a beach with a stick. It's very complex."
He added that while there are teams working on models that can directly translate cuneiform, it's still a long way off as the script is meant to be read in "shifting light." That's an inherent limitation for even the most powerful computer.
So to get around this, the team trained two versions of the neural network. The first translates Akkadian that has been transliterated into Latin script to English (T2E), while the other translates Unicode representations of cuneiform into English (C2E). (Unicode is an international encoding standard that assigns numbers to the characters of the world's scripts.) While not working directly from the tablets themselves, this method allowed the researchers to create models that effectively translated the Akkadian.
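To make the difference between the two input forms concrete, here is a minimal Python sketch of what "Unicode representations of cuneiform" look like to a program. The sample signs are arbitrary picks from the Unicode Cuneiform block, not text from the team's corpus, and the helper name `describe` is made up for this illustration; the article does not describe how the researchers actually prepared their data.

```python
import unicodedata

# The main Unicode "Cuneiform" block spans U+12000 through U+123FF.
CUNEIFORM_BLOCK = range(0x12000, 0x12400)


def describe(text):
    """Return (code point, Unicode name, in-cuneiform-block?) for each character."""
    rows = []
    for ch in text:
        code_point = f"U+{ord(ch):04X}"
        name = unicodedata.name(ch, "UNKNOWN")
        rows.append((code_point, name, ord(ch) in CUNEIFORM_BLOCK))
    return rows


# Arbitrary sample signs (an assumption for illustration only).
signs = "\U00012000\U0001202D"

# C2E-style input: each sign is a single numbered character.
for row in describe(signs):
    print(row)

# T2E-style input: ordinary Latin letters, outside the cuneiform block.
for row in describe("an"):
    print(row)
```

Either way, the model receives a sequence of characters rather than a photograph of a tablet, which is how the team sidesteps the "three-dimensional handwriting" problem.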
To measure the success of the translations, the team used the Bilingual Evaluation Understudy (BLEU), a metric that scores a machine translation by how closely its wording overlaps with a human reference translation. T2E produced the more accurate translations of the two models, achieving an average score of 37.47, while C2E achieved an average score of 36.52. These are both relatively high BLEU scores and indicate that the model was capable of creating understandable translations.
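For readers curious how such a score comes about, the following is a simplified, single-sentence sketch of the BLEU idea in Python: count how many word sequences of length one through four in the machine output also appear in the human reference, average those precisions geometrically, and penalize output that is too short. It is not the team's evaluation code. Published scores are normally computed over a whole test set with a standardized tokenizer, and the function names here are invented for illustration. The sentence pair is the one Gordin's team cites as an example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and keep only word characters; real evaluations use a
    # standardized tokenizer, so this is a simplification.
    return re.findall(r"\w+", text.lower())


def ngrams(tokens, n):
    # Count every run of n consecutive tokens.
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def simple_bleu(candidate, reference, max_n=4):
    """Unsmoothed sentence-level BLEU on a 0-100 scale (illustrative only)."""
    cand, ref = tokenize(candidate), tokenize(reference)
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = ngrams(cand, n), ngrams(ref, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            return 0.0  # without smoothing, one empty level zeroes the score
        log_precisions.append(math.log(overlap / total))
    # The brevity penalty lowers the score of candidates shorter than the reference.
    if len(cand) >= len(ref):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(ref) / len(cand))
    return 100 * brevity_penalty * math.exp(sum(log_precisions) / max_n)


human = ("If the day of disappearance of the moon reaches its normal length: "
         "the days of the ruler will be long.")
model = "If the day reaches its normal length: a reign of long days."

print(round(simple_bleu(model, human), 2))
```

A score for one short sentence is noisy and is not comparable to the averages reported in the study; the sketch only shows why output that drops or rewords phrases loses points.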
While the neural network shows a lot of promise, there were some caveats. For one, the model is prone to hallucination, a perennial problem in LLMs and other generative AI where the system makes up inaccurate or entirely false answers. Gordin noted that this would often happen when the model attempted to translate text longer than 118 characters.
For example, one line whose human translation reads "If the day of disappearance of the moon reaches its normal length: the days of the ruler will be long" came out of the model as "If the day reaches its normal length: a reign of long days."
"So it's very close, but not as accurate as a human translator," Gordin said.
Gordin said this example underscores the importance of always keeping a human in the loop when it comes to this tool and other AIs like it. With the proliferation of LLMs, these systems need to be viewed as tools to assist actual flesh-and-blood people in their work rather than replace them entirely. As useful as the Akkadian neural network is when it comes to the work that Gordin and his fellow Assyriologists are doing, it still can't outright replace a human scholar's intuition and oversight.
"There's still a human element," Monroe said. "In terms of translating the world of Akkadian, this is very effective. It removes one or two steps from the process, but there's still humans that have to actually hold the clay tablets and study them."
You might not realize it, but you owe a lot to the Mesopotamians. Everything from our numerical system, to our knowledge of astronomy, to our system of the rule of law can be traced back to this very ancient civilization and its writing system. That's why Assyriologists like Gordin and Monroe work to translate these tablets: so we might learn more about the origins that led to the current state of the world.
But there's still a lot to be done even with an AI model to help lighten the load. As we've seen in just the past few months since the release of ChatGPT, though, technology can move at eye-watering speeds, and that's especially the case with artificial intelligence.
According to Gordin, the model can be upgraded and trained on different periods of cuneiform. This will allow it to translate an even greater corpus of Akkadian and yield even more insight into these ancient cultures. Moreover, the team plans on putting the model on the Babylonian Engine so more people can access it.
"We want to make it more accessible to people who are not necessarily historians or Assyriologists," Gordin said. "We want to make it easier for them to pick up a text and get a translation through this model."
This democratization of the research could help unearth even more insights into these ancient civilizations and how their people lived. Meanwhile, training neural networks on these languages helps refine them in ways that translating "living" languages alone wouldn't necessarily allow. In that way, a very, very old and extinct language is still capable of teaching us new things about life today.
"There's amazing things that we find in cuneiform," Monroe said. "There's all these things that really resonate with us now, and show the commonality that we have with ancient people that are so distant, not only language, but also time."