An investigation of statistical learning curves: do we always need big data?

Type of content
Theses / Dissertations
Thesis discipline
Statistics
Degree name
Master of Science
Publisher
University of Canterbury
Language
English
Date
2017
Authors
Li, Yang
Abstract

The rapidly evolving Big Data technology has attracted increasing attention and been widely used in many industries. It not only benefits our lives dramatically but also poses new challenges. In many situations, dealing with such big and complex data can be extremely difficult. However, do we really always need big data?

This thesis investigates whether a large dataset is needed to build a model with acceptable accuracy and how the number of observations affects the performance of statistical predictive methods, using learning curves to describe this relationship. Several popular statistical learning methods were considered and applied to three large datasets. An efficient parallel coding strategy in R is also provided.

Rights
All Rights Reserved