CLASP
The Centre for Linguistic Theory and Studies in Probability

Information processing and cross-linguistic universals

Finding explanations for the observed variation in human languages is the primary goal of linguistics, and promises to shed light on the nature of human cognition. One particularly attractive set of explanations is functional in nature, holding that language universals are grounded in the known properties of human information processing. The idea is that grammars of languages have evolved so that language users can communicate using sentences that are relatively easy to produce and comprehend. In this talk, I summarize results from explorations in two linguistic domains, from an information-processing point of view.

First, I consider communication-based origins of the lexicons of human languages. Chomsky has famously argued that this hypothesis is flawed because of the existence of phenomena such as ambiguity. Contrary to Chomsky, we show that ambiguity out of context is not only unproblematic for an information-theoretic approach to language, it is a feature. Furthermore, word lengths are optimized on average according to predictability in context, as would be expected under an information-theoretic analysis. We then apply this simple information-theoretic idea to a well-studied semantic domain: words for colors. Second, I show that all the world’s languages that we can currently analyze minimize syntactic dependency lengths to some degree, as would be expected under information-processing considerations.

Readings:

Piantadosi, S.T., Tily, H. & Gibson, E. (2012). The communicative function of ambiguity in language. Cognition 122: 280-291. http://tedlab.mit.edu/tedlab_website/researchpapers/Piantadosi_et_al_2012_Cogn.pdf

Piantadosi, S.T., Tily, H. & Gibson, E. (2011). Word lengths are optimized for efficient communication. Proceedings of the National Academy of Sciences 108(9): 3526-3529. http://tedlab.mit.edu/tedlab_website/researchpapers/Piantadosi_et_al_2011_PNAS.pdf

Futrell, R., Mahowald, K. & Gibson, E. (2015). Large-scale evidence of dependency length minimization in 37 languages. Proceedings of the National Academy of Sciences 112(33): 10336-10341. doi: 10.1073/pnas.1502134112. http://tedlab.mit.edu/tedlab_website/researchpapers/Futrell_et_al_2015_PNAS.pdf