An investigation of statistical learning curves: do we always need big data?

Li, Yang

Files

Li, Yang_Master's Thesis.pdf (745.32 KB)

Type of content

Theses / Dissertations

UC permalink

http://hdl.handle.net/10092/13954
http://dx.doi.org/10.26021/3394

Thesis discipline

Statistics

Degree name

Master of Science

Publisher

University of Canterbury

Language

English

Date

2017

Authors

Li, Yang

Abstract

Big Data technology has developed rapidly, attracted increasing attention, and been widely adopted in many industries. It benefits our lives dramatically, but it also poses new challenges. In many situations, dealing with such large and complex data can be extremely difficult. However, do we really always need big data?

This thesis investigated whether a large dataset is needed to build a model with acceptable accuracy and how the number of observations affects the performance of statistical predictive methods, and used learning curves to describe this relationship. Several popular statistical learning methods were considered and applied to three large datasets. An efficient parallel coding strategy in R was also provided.
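As an illustration of what a learning curve measures, the following minimal sketch fits a simple linear model on training sets of increasing size and records the average error on a fixed test set. It is not the thesis's method, data, or R code: the simulated data, the linear model, and the names `fit_line`, `simulate`, and `learning_curve` are all assumptions made for this example.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def simulate(n, rng):
    """Hypothetical data: y = 1 + 2x + Gaussian noise with unit variance."""
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [1.0 + 2.0 * x + rng.gauss(0, 1) for x in xs]
    return xs, ys


def learning_curve(sizes, reps=30, seed=0):
    """Return (n, mean test MSE) pairs, averaged over `reps` training draws."""
    rng = random.Random(seed)
    test_x, test_y = simulate(1000, rng)  # fixed test set shared by all sizes
    curve = []
    for n in sizes:
        total = 0.0
        for _ in range(reps):
            xs, ys = simulate(n, rng)
            a, b = fit_line(xs, ys)
            sq_err = sum((y - (a + b * x)) ** 2 for x, y in zip(test_x, test_y))
            total += sq_err / len(test_x)
        curve.append((n, total / reps))
    return curve


if __name__ == "__main__":
    for n, mse in learning_curve([10, 50, 250, 1000, 5000]):
        print(f"n={n:>5}  test MSE={mse:.4f}")
```

In a setting like this one, the test error typically falls quickly for small training sets and then flattens near the noise level, so additional observations bring little further gain. That plateau is the kind of behaviour a learning curve is meant to reveal.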

Rights

https://canterbury.libguides.com/rights/theses

Collections

Engineering: Theses and Dissertations
