Arts: Conference Contributions
Recent Submissions
Item Open Access
Variation and Instability in Dialect-Based Embedding Spaces (Association for Computational Linguistics, 2023) Dunn, Jonathan
This paper measures variation in embedding spaces which have been trained on different regional varieties of English while controlling for instability in the embeddings. While previous work has shown that it is possible to distinguish between similar varieties of a language, this paper experiments with two follow-up questions: First, does the variety represented in the training data systematically influence the resulting embedding space after training? This paper shows that differences in embeddings across varieties are significantly higher than baseline instability. Second, is such dialect-based variation spread equally throughout the lexicon? This paper shows that specific parts of the lexicon are particularly subject to variation. Taken together, these experiments confirm that embedding spaces are significantly influenced by the dialect represented in the training data. This finding implies that there is semantic variation across dialects, in addition to previously studied lexical and syntactic variation.

Item Open Access
Te whakaoho o te mōhiotanga huna (2022) Hay J; Keegan P; Mattingley W; Todd S; Panther F; King, Jeanette

Item Open Access
Stability of Syntactic Dialect Classification Over Space and Time (2022) Wong S; Dunn, Jonathan
This paper analyses the degree to which dialect classifiers based on syntactic representations remain stable over space and time. While previous work has shown that the combination of grammar induction and geospatial text classification produces robust dialect models, we do not know what influence both changing grammars and changing populations have on dialect models. This paper constructs a test set for 12 dialects of English that spans three years at monthly intervals with a fixed spatial distribution across 1,120 cities.
Syntactic representations are formulated within the usage-based Construction Grammar paradigm (CxG). The decay rate of classification performance for each dialect over time allows us to identify regions undergoing syntactic change. And the distribution of classification accuracy within dialect regions allows us to identify the degree to which the grammar of a dialect is internally heterogeneous. The main contribution of this paper is to show that a rigorous evaluation of dialect classification models can be used to find both variation over space and change over time.

Item Open Access
Towards a theory of motivation: describing commitment to the Māori language (2009) King, Jeanette; Gully, Nichol Catherine

Item Open Access
Cosmic Xerox Machines, Tattoo Removal, and Defining 'Physicalism' (2018) Campbell, Douglas

Item Open Access
Greenbeard Theory, Meet Simulation Theory (2020) Campbell, Douglas

Item Open Access
Robots in Nozickland (2020) Campbell, Douglas

Item Open Access
Beyond competence - Implications for WIL in inter-professional healthcare practice (2022) Borren J; Maidment J; Tudor, Raewyn

Item Open Access
Unsupervised morphological segmentation in a language with reduplication (2022) Todd S; Huang A; Needle J; King J; Hay, Jennifer
We present an extension of the Morfessor Baseline model of unsupervised morphological segmentation (Creutz and Lagus, 2007) that incorporates abstract templates for reduplication, a typologically common but computationally underaddressed process. Through a detailed investigation that applies the model to Māori, the Indigenous language of Aotearoa New Zealand, we show that incorporating templates improves Morfessor’s ability to identify instances of reduplication, and does so most when there are multiple minimally-overlapping templates.
We present an error analysis that reveals important factors to consider when applying the extended model and suggests useful future directions.

Item Open Access
Taiwan: Party system of a young consolidated democracy (2021) Tan, Alex

Item Open Access
Uninvited Campaign Rally: Effects of Hong Kong’s Anti-Extradition Movement on Taiwan’s 2020 Election (American Political Science Association, 2021) Huang C; Tan, Alex
Party, candidate, and issue are undoubtedly the most frequently cited elements in electoral studies. All three, especially party system and issue debates, often reflect and cut along the main social and political cleavages in a society. However, the classic Michigan model and social cleavage theory may overlook the influence of events beyond the country’s border. It is curious that recent literature began to recognize subtle foreign intervention through internet and social media, yet few pay enough attention to the possible effects of intensively reported external events on domestic politics and their interactions. This study fills this void by studying an Asian new democracy and examining how events hundreds of miles away can send shock waves to impact, if not to reverse, the domestic public mood. We examine the effects of Hong Kong’s anti-extradition movement in 2019 on Taiwan voters’ views of cross-strait relationship, especially the stands on Taiwan independence vs. unification with China. We utilize the unique face-to-face survey panel data collected by the Taiwan Institute for Governance and Communication Research (TIGCR) at the National Chengchi University from 2018 to 2020 (TIGCR-PPS 2018, 2019 & 2020) to measure the stability and change of independence-unification views in Taiwan during the 2019 campaign period.
We find that the shift in the general public’s attitude on this long-existing political cleavage over cross-strait relations indeed accounts for Taiwan’s 2020 presidential election results.

Item Open Access
Learned Construction Grammars Converge Across Registers Given Increased Exposure (Association for Computational Linguistics, 2021) Tayyar Madabushi H; Dunn, Jonathan
This paper measures the impact of increased exposure on whether learned construction grammars converge onto shared representations when trained on data from different registers. Register influences the frequency of constructions, with some structures common in formal but not informal usage. We expect that a grammar induction algorithm exposed to different registers will acquire different constructions. To what degree does increased exposure lead to the convergence of register-specific grammars? The experiments in this paper simulate language learning in 12 languages (half Germanic and half Romance) with corpora representing three registers (Twitter, Wikipedia, Web). These simulations are repeated with increasing amounts of exposure, from 100k to 2 million words, to measure the impact of exposure on the convergence of grammars. The results show that increased exposure does lead to converging grammars across all languages. In addition, a shared core of register-universal constructions remains constant across increasing amounts of exposure.

Item Open Access
Virtues, vices and place attachment (2021) Mason, Carolyn
There is a virtue associated with forming and maintaining relationships to places. This virtue has not been recognised by philosophers, but it plays a role in indigenous cultures across the world. Hence, place attachment is one of many areas in which indigenous knowledge can contribute to the development of Western philosophy.
After explaining what it means for a disposition to act in accordance with this virtue to be a neo-Aristotelian virtue, examples from Māori culture are used to explain why the way that people form relationships to places can be a virtue in this neo-Aristotelian sense. Recognising this virtue reveals ways of interacting with the world that contribute to human and environmental flourishing, as well as revealing a new way in which indigenous people are harmed when dispossessed of their ancestral land.

Item Open Access
Production vs Perception: The Role of Individuality in Usage-Based Grammar Induction (Association for Computational Linguistics, 2021) Nini A; Dunn, Jonathan
This paper asks whether a distinction between production-based and perception-based grammar induction influences either (i) the growth curve of grammars and lexicons or (ii) the similarity between representations learned from independent sub-sets of a corpus. A production-based model is trained on the usage of a single individual, thus simulating the grammatical knowledge of a single speaker. A perception-based model is trained on an aggregation of many individuals, thus simulating grammatical generalizations learned from exposure to many different speakers. To ensure robustness, the experiments are replicated across two registers of written English, with four additional registers reserved as a control. A set of three computational experiments shows that production-based grammars are significantly different from perception-based grammars across all conditions, with a steeper growth curve that can be explained by substantial inter-individual grammatical differences.

Item Open Access
Copyright and Post-disaster Archiving (2019) Thomson C; Millar, Paul
In this workshop session Paul Millar delivers a presentation, jointly prepared with Dr Chris Thomson, which discusses the experience of the CEISMIC project in dealing with copyright issues.
He outlines the relevant law as it stands and is applied in New Zealand, and discusses some of the unique situations the project encountered as the team tried to ensure openness, inclusivity and rigour within a copyright-compliant framework during a period of trauma and transition.

Item Open Access
Representations of Language Varieties Are Reliable Given Corpus Similarity Measures (Association for Computational Linguistics, 2021) Dunn, Jonathan
This paper measures similarity both within and between 84 language varieties across nine languages. These corpora are drawn from digital sources (the web and tweets), allowing us to evaluate whether such geo-referenced corpora are reliable for modelling linguistic variation. The basic idea is that, if each source adequately represents a single underlying language variety, then the similarity between these sources should be stable across all languages and countries. The paper shows that there is a consistent agreement between these sources using frequency-based corpus similarity measures. This provides further evidence that digital geo-referenced corpora consistently represent local language varieties.

Item Open Access
I the orator: strategies of self-presentation in mid Republican Rome (2020) Sciarrino E

Item Open Access