A Lexical Distance Study of Arabic Dialects
- Event: Seminar
- Lecturer: Kathrein Abu Kwaik
- Date: 13 February 2019
- Duration: 2 hours
- Venue: Gothenburg
We conduct a computational cross-dialectal lexical distance study to measure the similarities and differences between the Arabic dialects and Modern Standard Arabic (MSA). We draw on several methods from Natural Language Processing (NLP) and Information Retrieval (IR), such as the Vector Space Model (VSM), Latent Semantic Indexing (LSI) and the Hellinger Distance (HD), and apply them to different Arabic dialectal corpora. We also measure the lexical overlap among all the dialects and compute the frequencies of the most frequent words in each dialect. The results indicate that the Levantine dialects are very similar to one another and that Palestinian appears to be the closest to MSA.
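As a rough illustration of two of the measures mentioned in the abstract, the sketch below computes a Hellinger distance between the word-frequency distributions of two token lists, along with a simple vocabulary overlap. It is not the study's implementation: the function names, the use of Jaccard overlap, and the placeholder tokens are all assumptions made for the example.

```python
from collections import Counter
from math import sqrt


def word_distribution(tokens):
    """Turn a list of tokens into a relative-frequency distribution."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def hellinger_distance(p, q):
    """Hellinger distance between two discrete distributions.

    0.0 means identical distributions; 1.0 means no shared vocabulary.
    """
    vocab = set(p) | set(q)
    squared = sum(
        (sqrt(p.get(w, 0.0)) - sqrt(q.get(w, 0.0))) ** 2 for w in vocab
    )
    return sqrt(squared) / sqrt(2)


def vocabulary_overlap(tokens_a, tokens_b):
    """Jaccard overlap of two vocabularies (an assumed overlap measure)."""
    vocab_a, vocab_b = set(tokens_a), set(tokens_b)
    return len(vocab_a & vocab_b) / len(vocab_a | vocab_b)


# Placeholder tokens standing in for two dialect corpora (hypothetical data).
corpus_a = ["w1", "w2", "w2", "w3"]
corpus_b = ["w2", "w3", "w3", "w4"]

dist_a = word_distribution(corpus_a)
dist_b = word_distribution(corpus_b)

print("Hellinger distance:", hellinger_distance(dist_a, dist_b))
print("Vocabulary overlap:", vocabulary_overlap(corpus_a, corpus_b))
```

In practice each token list would come from a tokenized dialect corpus, and the pairwise distances could be collected into a matrix to compare all dialects and MSA at once.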