CLASP
The Centre for Linguistic Theory and Studies in Probability

Visualise my Corpus

Abstract: Are you a researcher (postgrad or staff) who works with large data sets, does critical discourse analysis, and/or uses tools to analyse and interpret your data? Have you run into the limits of visualising your results? Digital Humanities, Social Sciences, and Natural Language Processing (NLP) all use computational methods for corpus research, and all face similar problems when critically interpreting and then visualising their empirical data. In the first part of this presentation, I will take you through some of the online tools and software that you can use to visualise your corpus. The second part will be a step-by-step Python 3 tutorial for those who want an easy start to getting data from Twitter and applying some common techniques from NLP, Text Analysis, Machine Learning, Topic Modelling, and Corpus Linguistics. The Python tutorial will be made available on GitHub so you can practise these techniques in your spare time.