Marco Baroni

Universitat Pompeu Fabra

Social & Behavioural Sciences

Marco Baroni received a PhD in Linguistics from the University of California, Los Angeles, in 2000. After working in both research and industry, he joined the Center for Mind/Brain Sciences of the University of Trento, where he became associate professor in 2013. In 2016, Marco joined the Facebook Artificial Intelligence Research team. In 2019, he became ICREA research professor, affiliated with the Linguistics Department of Pompeu Fabra University in Barcelona. Marco's work in the areas of multimodal and compositional distributed semantics has received widespread recognition, including a Google Research Award, an ERC Starting Grant and the IJCAI-JAIR best paper prize. Marco's current research aims at a better understanding of artificial neural networks, in particular what they can teach us about human language acquisition and processing.

Research interests

Marco is interested in human language and how it is acquired. To study these topics, he develops and analyzes computational systems, in particular deep neural networks, that acquire aspects of language from realistic input data. Analyzing the inner dynamics and external behaviour of these systems sheds light on questions such as: how much linguistic knowledge is already implicitly present in input distributions, what are the minimal priors necessary for learning, and what is the space of solutions to the communication challenges that led to the evolution of language. The ultimate goal of Marco’s research is to bring about a more precise characterization of what is unique about the human language faculty.

Selected publications

– Lake B, Linzen T & Baroni M 2019, ‘Human few-shot learning of compositional instructions’. Proceedings of CogSci, 41st Annual Meeting of the Cognitive Science Society. (Accepted as oral presentation).

– Chaabouni R, Kharitonov E, Lazaric A, Dupoux E & Baroni M 2019, ‘Word-order biases in deep-agent emergent communication’, Proceedings of ACL (57th Annual Meeting of the Association for Computational Linguistics), pp. 5166-5175.

– Bouchacourt D & Baroni M 2019, ‘Miss Tools and Mr Fruit: Emergent communication in agents learning about object affordances’. Proceedings of ACL 2019 (57th Annual Meeting of the Association for Computational Linguistics), East Stroudsburg PA: ACL.

– Kharitonov E, Chaabouni R, Bouchacourt D & Baroni M 2019, ‘EGG: a toolkit for research on Emergence of lanGuage in Games’. Proceedings of EMNLP 2019 (Conference on Empirical Methods in Natural Language Processing).

– Hahn M & Baroni M 2019, ‘Tabula nearly rasa: Probing the linguistic knowledge of character-level neural language models trained on unsegmented text’, Transactions of the Association for Computational Linguistics, vol. 7, pp. 467-484.

– Chaabouni R, Kharitonov E, Dupoux E & Baroni M 2019, ‘Anti-efficient encoding in emergent communication’. Proceedings of NeurIPS (33rd Conference on Neural Information Processing Systems), Vancouver, BC: Neural Information Processing Systems Foundation.

Selected research activities

– 8 invited talks, including:

  • Inaugural lecture at the CCIL master in Barcelona;
  • Plenary talk at the 7th Cambridge Neuroscience Symposium;
  • Plenary talk at the 4th Workshop on Representation Learning for NLP.

Co-organized the two-day Understanding Human and Machine Intelligence Workshop at Facebook New York.

Co-organized a week-long workshop on Compositionality in Humans and Machines at the Leiden Lorentz Center.

Open-sourced the EGG toolkit (>100 stars on GitHub as of January 2020).

Member of the ERC Consolidator Grant SH4 Panel (The Human Mind and Its Complexity).