Species classification on thermal video using a convolutional recurrent neural network. (2021)
Type of Content: Theses / Dissertations
Thesis Discipline: Computer Science
Degree Name: Master of Science
Publisher: University of Canterbury
This paper proposes a new approach to species surveying that uses convolutional recurrent neural networks (CRNNs). Breakthroughs in neural network architectures and designs, together with modern hardware, make possible new approaches that have not yet been investigated. Analysing thousands of hours of footage allows for more accurate, timely, and interesting surveying, far surpassing current approaches used by conservation programs. Prior to this research, a reliable dataset of thermal images did not exist, much less a dataset that records motion. The dataset created here has also been labelled, and categorised by location and time. While the creation of this dataset alone is a contribution, the CRNN also achieves high performance and reliable detection for all trained classes, which improves as more data is gathered. This puts the neural network approach ahead of any other extant method, as those that do exist either use static images, rely on infrared illumination, or perform worse.
The proposed approach is much better at detecting animals (by a factor of over 3,000) than current low-tech trap-based or observation-based approaches, such as trapping lines, transects, dog hunting, or observations. It is also more accurate than extant trail cameras for detecting small mammals, being about 10 to 50 times better in experimental trials.
Furthermore, the network itself performs well on trained classes, with the accuracy of the CRNN reaching up to 87 percent. The catchment includes all night hours (the definition of which can be widened or narrowed based on latitude and time of year, or simply ambient light levels). The filming technique uses a thermographic passive infrared camera and requires a cold background. Processing time per animal occurrence (3 ms) is unaffected by the total amount of footage, though the more footage is captured, the more needs to be processed, so total processing time increases linearly. Finally, the approach described in this paper has the potential to be used internationally, on all continents and in all environments, and on all animals over a certain size, whether those animals interact, are delicate or easily damaged, or are rare; it is limited only by the size and quality of the annotated dataset on which it is trained. While it is not proposed as a replacement for all of the existing manual quantification tools, it has been shown to be a successful and useful addition to the toolkits of conservation efforts.
Rights: All Rights Reserved