Gemma Boleda

Universitat Pompeu Fabra

Engineering Sciences

I am an ICREA Research Professor in the Department of Translation and Language Sciences of the Universitat Pompeu Fabra, where I co-lead the Computational Linguistics and Linguistic Theory (COLT) research group. I previously held post-doctoral positions at the Computer Science department of Universitat Politècnica de Catalunya (Spain), the department of Linguistics of The University of Texas at Austin (USA), and the CIMEC Center for Brain/Mind Sciences of the University of Trento (Italy). Before that, I graduated in Spanish Philology at Universitat Autònoma de Barcelona and obtained my PhD at Universitat Pompeu Fabra (both in Spain). I was a visiting researcher at the Computational Linguistics & Phonetics department of Saarland University and the Institute for Natural Language Processing (IMS) of the University of Stuttgart, both in Germany.

Research interests

I want to understand how language works; in particular, how humans convey meaning through language. The focus of my research is how the linguistic system and its use in communication influence each other. For instance, a speaker of English can use different expressions (e.g. "the dog/chihuahua/small dog") when referring to a given chihuahua. The choice depends, among other things, on the words and grammar available in the language and the properties of the object. In turn, over time, specific speaker choices in communicative situations change the system itself. I study these dynamics in a range of semantic phenomena, asking which aspects are universal across languages and what governs variation.

My team and I address these research questions with a cross-disciplinary approach that integrates methodologies from Linguistics, Artificial Intelligence, and Cognitive Science. Our approach requires large amounts of data, and part of our work involves gathering linguistic data on a large scale.

Selected publications

- Brochhagen T & Boleda G 2022, 'When do languages use the same word for different meanings? The Goldilocks Principle in colexification', Cognition, vol. 226, 105179.

- Gualdoni E, Mädebach A, Brochhagen T & Boleda G 2022, 'Woman or tennis player? Visual typicality and lexical frequency affect variation in object naming', Proc of the Annual Meeting of the Cognitive Sci Society, 44, pp 990-996.

- Mädebach A, Gualdoni E, Torubarova E & Boleda G 2022, 'Effects of task and visual context on referring expressions using natural scenes', Proc of the Annual Meeting of the Cognitive Sci Society, 44, pp 3188-3194.

- Sorodoc I, Aina L & Boleda G 2022, 'Challenges in including extra-linguistic context in pre-trained language models', Proc of the Third Workshop on Insights from Negative Results in NLP, pp 134-138.

Selected research activities


  • Keynote speaker, Amsterdam Colloquium, The Netherlands
  • Two invited talks in scientific seminars (U. Pompeu Fabra, U. Toronto)

Student supervision, completed in 2022:

  • 2 PhD theses (Laura Aina and Ionut-Teodor Sorodoc, both U. Pompeu Fabra)
  • PhD thesis committee member (Timothee Mickus, U. de Lorraine, France)
  • 3 master's theses
  • 1 research assistant

Scientific organization: chair of two mini-workshops (around 50 attendees each; hybrid format; U. Pompeu Fabra) on:

  • Referential Information in Deep Learning Models
  • Linguistic Ambiguity and Deep Learning


  • Standing review committee member, TACL journal
  • Area Chair for Lexicon & Semantics, COLING 2022
  • Advisor, ERC proposals, Spanish Science and Technology Foundation (FECYT)
  • Reviewing, journals: Applied Linguistics, Linguistics, Dialogue and Discourse, Cognitive Science, Corpus Linguistics and Linguistic Theory; plus two conferences and a summer school

Outreach (activities fostering the presence of women in science):

  • Co-organizer, Jornada STEM: Visibilitzem l'Enginyeria [STEM Day: Making Engineering Visible], Barcelona, Feb 2022
  • Mentor, Amsterdam Colloquium 2022 Pop-Up Mentoring Program (PUMP)
  • Mentor, CogSci 2022 Mentoring Program