Abstract Wikipedia and Vastly Multilingual Natural Language Generation
- Event: Seminar
- Lecturer: Aarne Ranta, University of Gothenburg & Chalmers
- Date: 08 December 2021
- Duration: 2 hours
- Venue: Gothenburg and Online
Abstract: “Abstract Wikipedia is an initiative from the Wikimedia Foundation to generate Wikipedia articles from an abstract (i.e. language-neutral) source in multiple languages. The goal has been set at 20 million articles in over 300 languages, guaranteed to be in synchrony with up-to-date information and thereby with each other. This is by far the largest Natural Language Generation (NLG) project of all time. Grammatical Framework (GF), with 40 languages and specialized domains such as science, law, and e-commerce, is orders of magnitude smaller. Nevertheless, GF has served as inspiration for Abstract Wikipedia, and pilot projects have started to scale it up to the task. Research in NLG techniques, language resources, processing algorithms, and interaction with human authors is needed. This talk will outline a possible way to build up Abstract Wikipedia by starting with simple text-robot-like techniques and proceeding to more sophisticated NLG.”