Measuring Linguistic Diversity During COVID-19

Type of content
Conference Contributions - Published
Date
2020
Authors
Dunn J
Coupe T
Adams B
Abstract

Computational measures of linguistic diversity help us understand the linguistic landscape using digital language data. The contribution of this paper is to calibrate measures of linguistic diversity using restrictions on international travel resulting from the COVID-19 pandemic. Previous work has mapped the distribution of languages using geo-referenced social media and web data. The goal, however, has been to describe these corpora themselves rather than to make inferences about underlying populations. This paper shows that a difference-in-differences method based on the Herfindahl-Hirschman Index can identify the bias in digital corpora that is introduced by non-local populations. These methods tell us where significant changes have taken place and whether this leads to increased or decreased diversity. This is an important step in aligning digital corpora like social media with the real-world populations that have produced them.
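
As a rough illustration of the two quantities named in the abstract, the sketch below computes the Herfindahl-Hirschman Index (the sum of squared language shares) and a basic two-period, two-group difference-in-differences on it. This is not the authors' implementation: the function names, the choice of control group, and the counts are all hypothetical, and the paper's actual design may differ.

```python
def hhi(language_counts):
    """Herfindahl-Hirschman Index: sum of squared shares.

    Ranges from 1/n (n equally used languages) to 1.0 (one language),
    so a HIGHER value means LOWER linguistic diversity.
    """
    total = sum(language_counts.values())
    if total <= 0:
        raise ValueError("language_counts must contain at least one observation")
    return sum((count / total) ** 2 for count in language_counts.values())


def diff_in_diff(pre_treated, post_treated, pre_control, post_control):
    """Change in HHI for the treated group minus the change for the control group.

    Assumption: "treated" is the period affected by travel restrictions and
    "control" is a comparison period or place that was not affected.
    """
    treated_change = hhi(post_treated) - hhi(pre_treated)
    control_change = hhi(post_control) - hhi(pre_control)
    return treated_change - control_change


# Hypothetical counts of geo-referenced posts per language code.
pre_treated = {"en": 60, "de": 40}
post_treated = {"en": 90, "de": 10}
pre_control = {"en": 50, "de": 50}
post_control = {"en": 50, "de": 50}

effect = diff_in_diff(pre_treated, post_treated, pre_control, post_control)
print(f"difference-in-differences on HHI: {effect:.2f}")
```

A positive result here indicates that concentration rose (diversity fell) in the treated group relative to the control group; a negative result indicates the opposite.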

Citation
Dunn J, Coupe T, Adams B (2020). Measuring Linguistic Diversity During COVID-19. The Fourth Workshop on Natural Language Processing and Computational Social Science. 19/11/2020-20/11/2020. Proceedings of the Fourth Workshop on Natural Language Processing and Computational Social Science.
ANZSRC fields of research
Fields of Research::47 - Language, communication and culture::4704 - Linguistics::470404 - Corpus linguistics
Fields of Research::47 - Language, communication and culture::4704 - Linguistics::470403 - Computational linguistics
Fields of Research::47 - Language, communication and culture::4704 - Linguistics::470411 - Sociolinguistics
Rights
All rights reserved unless otherwise stated