Species classification on thermal video using a convolutional recurrent neural network.

Carr, Christopher David

Files

Carr, Christopher_final Master's thesis.pdf (4.34 MB)

Type of content

Theses / Dissertations

UC permalink

https://hdl.handle.net/10092/102893
http://dx.doi.org/10.26021/12027

Thesis discipline

Computer Science

Degree name

Master of Science

Publisher

University of Canterbury

Language

English

Date

2021

Authors

Carr, Christopher David

Abstract

This paper proposes a new approach to species surveying using convolutional recurrent neural networks (CRNNs). Breakthroughs in neural network architectures and designs, together with modern hardware, make possible new approaches that have not yet been investigated. Analysing thousands of hours of footage allows for more accurate, timely, and interesting survey results, far surpassing current approaches used by conservation programs. Prior to this research, a reliable dataset of thermal images did not exist, much less one that records motion. The dataset created here has also been labelled and categorised by location and time. While the creation of this dataset alone is a contribution, the CRNN also achieves high performance and reliable detection for all trained classes, which improves as more data is gathered. This puts the neural network approach ahead of any other extant method, as those that do exist use static images, rely on infrared illumination, or perform worse.

The proposed approach is much better at detecting animals (by a factor of over 3,000) than current low-tech trap-based or observation-based approaches, such as trapping lines, transects, dog hunting, or observations. Further, it is more accurate than extant trail cameras at detecting small mammals, being about 10 to 50 times better in experimental trials.

Furthermore, the network itself performs well on trained classes, with the accuracy of the CRNN reaching up to 87 percent. The catchment includes all night hours (the definition of which can be increased or decreased based on latitude and time of year, or simply ambient light levels). The filming technique uses a thermographic passive infrared camera and requires a cold background. Processing time per occurrence is unaffected by total footage (3 ms per animal-occurrence), though the more footage captured, the more that needs to be processed, so total processing time increases linearly. Finally, the approach described in this paper has the potential to be used internationally, on all continents and in all environments, limited only by the size and quality of the annotated dataset on which it is trained. It applies to all animals over a certain size, whether those animals interact, are delicate or easily damaged, or are rare. While it is not proposed as a replacement for all of the existing manual quantification tools, it has been shown to be a successful and useful addition to the toolkits of conservation efforts.
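
As an illustration of the linear scaling described above, the following minimal sketch estimates total processing time from the number of animal-occurrences. It assumes only the 3 ms per-occurrence figure quoted in the abstract; the function name and the occurrence counts are hypothetical and are not taken from the thesis.

```python
# Per-occurrence processing time quoted in the abstract (milliseconds).
MS_PER_OCCURRENCE = 3.0


def total_processing_seconds(num_occurrences, ms_per_occurrence=MS_PER_OCCURRENCE):
    """Estimate total processing time in seconds.

    Hypothetical helper: assumes the per-occurrence cost is constant,
    so the total grows linearly with the number of occurrences.
    """
    if num_occurrences < 0:
        raise ValueError("num_occurrences must be non-negative")
    return num_occurrences * ms_per_occurrence / 1000.0


if __name__ == "__main__":
    # Illustrative occurrence counts only, not results from the thesis.
    for n in (1_000, 10_000, 100_000):
        print(f"{n} occurrences: {total_processing_seconds(n):.1f} s")
```

Under this assumption, doubling the number of occurrences doubles the total time, while the cost of any single occurrence stays the same.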

Rights

https://canterbury.libguides.com/rights/theses

Collections

Engineering: Theses and Dissertations
