CLASP
The Centre for Linguistic Theory and Studies in Probability

Experiments with category representations

Abstract

In this talk, I will start from the observation that computational models that use distributional semantics / word embeddings make (mostly implicit) use of categorization theory and can be analyzed in those terms. To illustrate this, I will present two studies on (a) classification in terms of prototype and exemplar models [1] and (b) alternatives to the default way of constructing word embeddings from co-occurrences of the category noun [2].

[1] Jennifer Sikos and Sebastian Padó. Frame Identification as Categorization: Exemplars vs Prototypes in Embeddingland. In: Proceedings of IWCS, pages 295-306. Gothenburg, Sweden, 2019. https://doi.org/10.18653/v1/W19-0425

[2] Matthijs Westera, Abhijeet Gupta, Gemma Boleda and Sebastian Padó. Distributional models of category concepts based on names of category members. Cognitive Science, 45(9):e13029, 2021. https://doi.org/10.1111/cogs.13029

VITA: Sebastian Padó has been professor of computational linguistics at Stuttgart University since 2013. He studied in Saarbrücken and Edinburgh and was a postdoctoral scholar at Stanford University. His core research concerns learning, representing, and processing semantic knowledge (broadly construed) from and in text. Examples include modeling linguistic phenomena (discourse structure, inference, etc.), applications in the computational social sciences and digital humanities, and methodological aspects such as model interpretation and robustness.