Marco Baroni received a PhD in Linguistics from the University of California, Los Angeles, in 2000. After several experiences in research and industry, he joined the Center for Mind/Brain Sciences of the University of Trento, where he became associate professor in 2013. In 2016, Marco joined the Facebook Artificial Intelligence Research team. In 2019, he became ICREA research professor, affiliated with the Linguistics Department of Pompeu Fabra University in Barcelona. Marco's work in the areas of multimodal and compositional distributed semantics has received widespread recognition, including a Google Research Award, an ERC Starting Grant, the IJCAI-JAIR best paper prize and the ACL test of time award. Marco's current research focuses on understanding communication in communities of deep-learning-trained artificial neural networks. In 2021, he was awarded an ERC Advanced Grant to work on this topic.
Research interests
While deep-learning-based artificial neural networks have revolutionized science, engineering and our daily life, we still know surprisingly little about how they work. Indeed, they can sometimes behave in completely unexpected ways, exposing weaknesses that can be used for harmful purposes. My current interest lies in opening the black box of modern neural networks, specifically in the domain of language, where I study so-called "large language models" such as ChatGPT. I am focusing on two main approaches to this challenge. On the one hand, I am studying how large language models react to inputs that are outside their training distribution. On the other, I use tools from probability, linear algebra and information theory to measure the complexity of the internal representations of large language models.
Selected publications
- Rakotonirina N & Baroni M, 2024, "MemoryPrompt: A Light Wrapper to Improve Context Tracking in Pre-trained Language Models", Proceedings of LREC-Coling.
Selected research activities
Invited talks:
- "Unnatural Language Processing: On the puzzling out-of-distribution behaviour of language models", HiTZ: Basque Center for Language Technology, Spain
- "The curious case of unnatural language", CIMeC: Center for Mind/Brain Sciences, Italy
- "The curious case of unnatural language", SISSA: Scuola Internazionale Superiore di Studi Avanzati, Italy
Dissemination:
- Invited panelist, "AI and languages" at AIxIA 2024
- Interview about AI and Linguistics for the Group on Speech and Spoken Language of the Italian Linguistics Association
- Invited panelist, "Giornata della Ricerca degli Italiani in Spagna", Italian Embassy, Madrid
- Invited panelist, "Uomo o Macchina? Le Neuroscienze e l'Intelligenza Artificiale", public event in Rovereto, Italy
Service:
- External examiner of two PhD theses (Université de Montréal and Sapienza University of Rome)
- ERC 2023 Synergy Grant Remote Panel Evaluator
- Area chair at NeurIPS 2024
- Area chair at ICLR 2024
- Session chair at Barcelona Deep Learning Symposium 2024