Evolutionary reinforcement learning for vision-based general video game playing.

Tupper, Adam

Files

Tupper, Adam_Final Master's Thesis.pdf (3.93 MB)

Type of content

Theses / Dissertations

UC permalink

https://hdl.handle.net/10092/101134
http://dx.doi.org/10.26021/10198

Thesis discipline

Computer Science

Degree name

Master of Science

Publisher

University of Canterbury

Language

English

Date

2020

Authors

Tupper, Adam

Abstract

Over the past decade, video games have become increasingly utilised for research in artificial intelligence. Perhaps the most extensive use of video games has been as benchmark problems in the field of reinforcement learning. Part of the reason is that video games are designed to challenge humans, and as a result, developing methods capable of mastering them is considered a stepping stone to achieving human-level performance in real-world tasks. Of particular interest are vision-based general video game playing (GVGP) methods. These are methods that learn from pixel inputs and can be applied, without modification, across sets of games. One of the challenges in evolutionary computing is scaling up neuroevolution methods, which have proven effective at solving simpler reinforcement learning problems in the past, to tasks with high-dimensional input spaces, such as video games. This thesis proposes a novel method for vision-based GVGP that combines the representational learning power of deep neural networks and the policy learning benefits of neuroevolution. This is achieved by separating state representation and policy learning and applying neuroevolution only to the latter. The method, AutoEncoder-augmented NeuroEvolution of Augmented Topologies (AE-NEAT), uses a deep autoencoder to learn compact state representations that are used as input for policy networks evolved using NEAT. Experiments on a selection of Atari games showed that this approach can successfully evolve high-performing agents and scale neuroevolution methods that evolve both weights and topology to domains with high-dimensional inputs. Overall, the experiments and results demonstrate a proof-of-concept of this separated state representation and policy learning approach and show that hybrid deep learning and neuroevolution-based GVGP methods are a promising avenue for future research.
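
The separation described in the abstract (a fixed encoder produces a compact state, and evolution is applied only to the policy that consumes it) can be illustrated with a minimal sketch. This is not the thesis's implementation. Everything in it is an assumption made for illustration: the toy 8x8 "frame" task, the hand-written encode function standing in for a trained deep autoencoder, and a simple elitist weight-mutation loop standing in for NEAT (topology evolution is omitted entirely).

```python
import random

FRAME_W, FRAME_H = 8, 8   # toy "screen" size (assumption)
LATENT = 4                # size of the compact state (assumption)
N_ACTIONS = 2             # toy action set: 0 = left, 1 = right (assumption)


def make_frame(rng):
    """Hypothetical task: one bright pixel; correct action is 0 if it is in the left half."""
    x, y = rng.randrange(FRAME_W), rng.randrange(FRAME_H)
    frame = [[0.0] * FRAME_W for _ in range(FRAME_H)]
    frame[y][x] = 1.0
    return frame, (0 if x < FRAME_W // 2 else 1)


def encode(frame):
    """Stand-in for a trained autoencoder's encoder: sum each vertical strip of pixels."""
    strip = FRAME_W // LATENT
    return [
        sum(row[c] for row in frame for c in range(i * strip, (i + 1) * strip))
        for i in range(LATENT)
    ]


def act(weights, latent):
    """Linear policy on the compact state: pick the action with the highest score."""
    scores = [sum(w * z for w, z in zip(weights[a], latent)) for a in range(N_ACTIONS)]
    return max(range(N_ACTIONS), key=lambda a: scores[a])


def fitness(weights, episodes):
    """Number of frames on which the policy chooses the correct action."""
    return sum(act(weights, encode(frame)) == target for frame, target in episodes)


def evolve(generations=30, pop_size=20, seed=0):
    """Elitist weight evolution (NOT NEAT): only the policy is evolved, the encoder is fixed."""
    rng = random.Random(seed)
    episodes = [make_frame(rng) for _ in range(50)]
    pop = [
        [[rng.gauss(0, 1) for _ in range(LATENT)] for _ in range(N_ACTIONS)]
        for _ in range(pop_size)
    ]
    for _ in range(generations):
        pop.sort(key=lambda w: fitness(w, episodes), reverse=True)
        elite = pop[: pop_size // 4]  # keep the top quarter unchanged
        children = [
            [[w + rng.gauss(0, 0.3) for w in row] for row in rng.choice(elite)]
            for _ in range(pop_size - len(elite))
        ]
        pop = elite + children
    best = max(pop, key=lambda w: fitness(w, episodes))
    return best, fitness(best, episodes), len(episodes)
```

Calling evolve() returns the best policy weights found, the number of toy frames that policy handles correctly, and the total number of frames. The point of the sketch is only that the search space seen by evolution has LATENT inputs rather than one input per pixel, which is the scaling argument the abstract makes.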

Rights

https://canterbury.libguides.com/rights/theses

Collections

Engineering: Theses and Dissertations
