Unsupervised Knowledge-based Word Sense Disambiguation: Exploration & Evaluation of Semantic Subgraphs
Degree GrantorUniversity of Canterbury
Degree NameDoctor of Philosophy
Hypothetically, if you were told: Apple uses the apple as its logo . You would immediately detect two different senses of the word apple , these being the company and the fruit respectively. Making this distinction is the formidable challenge of Word Sense Disambiguation (WSD), which is the subtask of many Natural Language Processing (NLP) applications. This thesis is a multi-branched investigation into WSD, that explores and evaluates unsupervised knowledge-based methods that exploit semantic subgraphs. The nature of research covered by this thesis can be broken down to: 1. Mining data from the encyclopedic resource Wikipedia, to visually prove the existence of context embedded in semantic subgraphs 2. Achieving disambiguation in order to merge concepts that originate from heterogeneous semantic graphs 3. Participation in international evaluations of WSD across a range of languages 4. Treating WSD as a classification task, that can be optimised through the iterative construction of semantic subgraphs The contributions of each chapter are ranged, but can be summarised by what has been produced, learnt, and raised throughout the thesis. Furthermore an API and several resources have been developed as a by-product of this research, all of which can be accessed by visiting the author’s home page at http://www.stevemanion.com. This should enable researchers to replicate the results achieved in this thesis and build on them if they wish.