CLASP
The Centre for Linguistic Theory and Studies in Probability

Simplifying Documents

Abstract: To date, most work on simplification has focused on sentences. Early attempts at document simplification merely applied these sentence-level approaches iteratively over the sentences of a document. However, this fails to preserve discourse structure coherently, leading to suboptimal output quality. In this talk, I will highlight the challenges involved in simplifying documents and argue that both the context and the internal structure of the sentences to be simplified need to be modelled. We will explore various models that exploit document context within the simplification process itself, either by iterating over larger text units or by extending the system architecture to attend over a high-level representation of document context. I will further discuss the performance and efficiency trade-offs of these system variants and suggest when each should be preferred.

(Joint work with Liam Cripwell)