New EU-backed project aims to expand Estonian linguistics research capabilities

Photo credit: Jevegeni Salikhov

A research consortium led by investigators at the University of Tartu has been awarded roughly €1 million in European Union funding to support the development of methodological expertise in linguistics. The official title of the effort is Methodological Excellence in Data-Driven Approaches to Linguistics, abbreviated as MEDAL, and it is set to commence in January 2023 and run through the end of December 2025.

MEDAL includes researchers from the Max Planck Institute for Psycholinguistics and the Donders Institute at Radboud University, both located in Nijmegen, the Netherlands. It also involves investigators at the University of Birmingham in the UK, who are being funded via a separate UK mechanism. With that UK funding included, the total budget for MEDAL is expected to be about €1.5 million.

According to Virve Vihman, associate professor of psycholinguistics at the University of Tartu, MEDAL is funded by the European Commission’s Twinning program. It aims to share knowledge between institutions with experience in data-driven linguistics and to invest in Tartu’s own resources. 

“The whole point of the EC’s Twinning scheme is to increase scientific capacity and to raise the competence and visibility of widening countries, Estonia being the widening country in this project,” said Vihman. Indeed, by the time that MEDAL runs its course, the University of Tartu expects to be an “internationally recognized hub of linguistics research excellence for Northern Europe,” according to the project’s summary.

To accomplish this, the University of Tartu plans to train a new generation of researchers in cutting-edge research methods. It will invest in methodological training and develop a model of cross-linguistic, cross-modal, and cross-disciplinary research. Three clusters of methods in particular will be explored: corpus studies, experimental methods, and computational modelling.

The University of Tartu’s Virve Vihman and Josh Wilbur. Photo credit: Lauri Kulpsoo

According to Vihman, linguistics has become more empirical. Corpus linguistics works with a corpus, a large sample of actual language that one can run analyses on. A corpus could be, for example, all Estonian-language web pages from a given year, transcribed recordings of conversations, or the translation of one book into several languages, which allows cross-linguistic comparisons of the same core text.

“With corpus linguistics, you can digitally process huge amounts of text that would otherwise not be feasible within your lifetime,” noted Joshua Wilbur, who will be lecturer of digital linguistics at the University of Tartu when the project begins in January. “If I wanted to, I could, in a matter of seconds, search for the number of times each word appears in a corpus of, for instance, modern Estonian novels,” he said. “These are things that just were not possible before.”
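As a rough illustration of the kind of word count Wilbur describes, a minimal Python sketch might look like the following. The two sample sentences and the function name `word_frequencies` are invented for this example and are not part of MEDAL or any University of Tartu tool; a real corpus would be read from files and would usually need more careful tokenization.

```python
import re
from collections import Counter


def word_frequencies(texts):
    """Count how often each word occurs across an iterable of text strings."""
    counts = Counter()
    for text in texts:
        # Lowercase, then pull out runs of word characters. In Python 3,
        # \w is Unicode-aware for str patterns, so Estonian letters such
        # as õ, ä, ö and ü stay inside their words.
        counts.update(re.findall(r"\w+", text.lower()))
    return counts


# Hypothetical two-sentence "corpus", made up for illustration only.
corpus = [
    "Keel on meie kodu.",
    "Meie keel on vana ja meie kodu on siin.",
]

freqs = word_frequencies(corpus)

# Print the words from most to least frequent.
for word, count in freqs.most_common():
    print(word, count)
```

The same function works unchanged on a much larger list of texts, for instance the chapters of a set of novels, which is what makes this kind of search a matter of seconds rather than a lifetime of manual counting.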

There are other methods that the researchers would like to develop further competence in, said Vihman, such as eye tracking and machine learning. Eye tracking has gained popularity in psycholinguistics research as a means of tracking people’s attention while they process written or spoken language. There is also an emphasis on cross-linguistic research, which, as the name suggests, compares multiple languages. Here, including Estonian in such comparative research is another opportunity to improve the overall understanding of human language.

“There is a disproportionate amount of linguistics research focused on English, and Indo-European languages in general,” said Vihman. “The consortium aims to highlight the need for rigorous methods for comparing across languages in order to better understand what languages have in common and how they vary, enabling us to make generalisations about human cognitive and linguistic abilities,” she said.

Seminars and short-term student and staff exchanges will form the backbone of MEDAL. “Especially for doctoral students it is useful to have contacts in the global and European research community,” said Vihman of the envisioned exchanges. Wilbur noted that MEDAL seeks to improve not only research but also how linguistics projects are administered and managed by institutions.

There will also be a research-oriented project, called Gold MEDAL, that will allow scientists to hone the new skills they are learning. Gold is an acronym for Generalizing Over Language Data. Yet MEDAL is less about achieving specific research aims and more about “exposing young scholars in Europe to a toolkit of methods,” Vihman said. “The research is intended as a chance for junior scholars to try their hand and experience being part of an international research project, collaborating with leaders in the field.”

She stressed that Tartu, which has engaged in linguistics research since the 19th century, already has a strong foundation in linguistics, and will build on this base by leading MEDAL.

“We are developing what we already have, but using this new funding scheme to inject a lot of new expertise, connections, and inspiration for people to use as a springboard,” commented Vihman. “We couldn’t do it if we were starting from zero.”

Written by: Justin Petrone

This article was funded by the European Regional Development Fund through Estonian Research Council.