Can LLMs process Indigenous Languages? An Exploration of AI for language documentation with Bribri and Cook Islands Māori

Event: Joint Linguistics-CLASP Seminar
Presented by: Rolando Coto Solano from Dartmouth College
Date: 23 September 2025
Time: 13:15-15:00
Venue: Gothenburg University, Humanisten and online
Address: Renströmsgatan 6, 412 55 Göteborg
Room: J309
Zoom link: https://gu-se.zoom.us/j/67063108947?pwd=kPpjvMLCekxNTBVzq4uYP5gFZ6Y6vd.1

Abstract

Large Language Models (LLMs) perform well on tasks involved in language documentation, such as transcription, translation and data analysis, when the languages are widely spoken. However, LLMs continue to show gaps in reliability with languages at the lower end of the resource spectrum. Here we will explore the performance of cutting-edge LLMs in documentation tasks with two languages: Bribri (Costa Rica) and Cook Islands Māori. We will also revisit efforts not only to expand the documentation of these languages and improve AI performance on them, but also to increase the role of community members in the documentation work.