Gemma Boleda

Universitat Pompeu Fabra

Engineering Sciences

I am an ICREA Research Professor in the Department of Translation and Language Sciences of the Universitat Pompeu Fabra, where I co-lead the Computational Linguistics and Linguistic Theory (COLT) research group. I previously held post-doctoral positions at the Computer Science department of Universitat Politècnica de Catalunya (Spain), the department of Linguistics of The University of Texas at Austin (USA), and the CIMEC Center for Brain/Mind Sciences of the University of Trento (Italy). Before that, I graduated in Spanish Philology at Universitat Autònoma de Barcelona and obtained my PhD in Cognitive Science and Language at Universitat Pompeu Fabra (both in Spain). I was also a visiting researcher at the Computational Linguistics & Phonetics department (CoLi) of Saarland University and the Institute for Natural Language Processing (IMS) of the University of Stuttgart, both in Germany.

Research interests

I want to understand how language works; in particular, how humans convey meaning through language, how the formal properties of language support communication, and how languages are shaped by both cognitive and communicative factors. I study these dynamics in a range of domains and phenomena, with special emphasis on the lexicon (vocabulary), and I investigate which aspects are universal across languages, and what governs variation. My team and I work with a cross-disciplinary approach that integrates methodologies from Linguistics, Artificial Intelligence, and Cognitive Science. Our approach requires large amounts of data, and part of our work involves gathering linguistic data on a large scale.

Selected publications

- Gualdoni E & Boleda G 2024, 'Why do objects have many names? A study on word informativeness in language use and lexical systems'. Proceedings of EMNLP, 18150-18163.
- Liao X, Boleda G, Rohde H & Mayol L 2024, 'Comparing models of pronoun production and interpretation via observational and experimental evidence'. Glossa: a journal of general linguistics, vol. 9, no. 1.
- Chen Z, Mädebach A, Gualdoni E & Boleda G 2024, 'On the Use of Language and Vision Models for Cognitive Science: The Case of Naming Norms'. Proceedings of CogSci, 46, 6040-6047.

Selected research activities

Dissemination
  • Keynote: Pressures on the lexicon and their effects. 16th International Conference on Computational Processing of Portuguese (PROPOR 2024). Santiago de Compostela, Spain.
  • Invited talk: On word meaning, word use, and creativity. NCCR Interdisciplinary Workshop 2024: Finding Interdisciplinary Ground for Empirical Work on Meaning. Neuchâtel, Switzerland.
  • Invited talk: Dades i eines d'accés obert: clau per a la diversitat lingüística a Europa (Open-access data and tools: key to linguistic diversity in Europe). Campus NPLD-Coppieters on language policy, Institut d’Estudis Catalans, Barcelona, Spain.
  • Plus 4 invited talks at scientific institutions.
PhD theses defended
  • Eleonora Gualdoni, 2024, Universitat Pompeu Fabra (supervision). Moved to intern at Apple Machine Learning Research.
  • Xixian Liao, 2024, Universitat Pompeu Fabra (co-supervision). Moved to research engineer at Barcelona Supercomputing Center.
New post-docs with competitive funding
  • Iria De Dios, Juan de la Cierva, 2024-2025
  • Carmen Saldana, Beatriu de Pinós, 2024-2027
Resource construction
  • Released v. 2.2 of ManyNames, a dataset for object naming (25,000 images). Open data.
Event organization
  • Workshop on Anaphora and Predictability, 29/04/2024
  • 2024 CORE Project Workshop: Unpacking Efficient Communication: The Roles of Cognitive Bias and Extralinguistic Context in Referring Expression Choice, 18-19/04/2024