Extensible optical music recognition
Thesis Discipline: Computer Science
Degree Grantor: University of Canterbury
Degree Name: Doctor of Philosophy
The aim of Optical Music Recognition (OMR) is to convert optically scanned pages of music into a versatile machine-readable format. Existing work has achieved this aim for restricted sets of music notation. Here we investigate the design of an extensible OMR system. Music notation is characterised by intricate features that prove too complex for current computer systems to recognise in a single step. A common methodology in OMR systems is to detect simple primitive shapes, which are then assembled into the intricate musical features. However, developing a system capable of processing an extensible set of notation is problematic because there is no limit to the musical shapes that can occur. This thesis deals with the issue by combining a specially designed programming language for primitive detection, a user-configurable knowledge-base for primitive assembly, and an object-oriented interface for musical semantics. As a result, the design is capable of processing not only an extensible set of shapes within one notation, but also a variety of notations, such as common music notation, plainsong notation, and tablature. The specially designed programming language eliminates the need for repetitive descriptions, and consequently the code is concise. Grammar rules in the knowledge-base provide a flexible medium in which the valid taxonomy of musical features can be expressed. Finally, the object-oriented interface provides a mechanism that can be tailored to encode the semantics of a specific musical notation. Within this framework, the thesis investigates six important steps in the OMR process: staff detection, musical object location, image enhancement, primitive detection, primitive assembly, and musical semantics. Existing work is refined and new algorithms are developed where appropriate. The thesis concludes by comparing the performance of two OMR configurations aimed at reliable matching. Both take approximately 10 minutes to process an A4 page of music on a Digital Celebris GL 5133, with an overall accuracy rate that exceeds 96%.