Compress and Control: A generative approach to reinforcement learning
In this talk we offer a generative perspective on value function approximation in reinforcement learning. Based on this perspective, we develop the Compress and Control algorithm, which transforms arbitrary density estimators into value functions. In particular, we consider compression methods such as the Lempel-Ziv and Context Tree Switching algorithms as base models. The appeal of compression methods for density estimation is that they are, in a sense, feature-free: they can be tractably applied to bit sequences, and therefore to any kind of data. Along with a theoretical overview of the method, we present empirical results on the Atari 2600 platform.

Reference: webdocs.cs.ualberta.ca/~mg17/publications/veness14compress.pdf
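The abstract describes the idea only at a high level. As a rough illustration of how a density estimator can be turned into a value estimate, the Python sketch below models states conditioned on discretised returns and applies Bayes' rule to recover a value. This is a hypothetical illustration, not taken from the talk or the paper: the CountingDensityModel class, the return-bin discretisation, and the value_estimate helper are assumptions made here for exposition; in the talk's setting the per-bin density model would instead be a compression method such as Lempel-Ziv or Context Tree Switching applied to bit sequences.

    # Minimal sketch (assumptions, not the authors' implementation):
    # estimate V(state) from density models over states, one per return bin.
    from collections import defaultdict

    class CountingDensityModel:
        """Toy density estimator: a Laplace-smoothed count model over symbols.
        Stands in for a compression-based model (e.g. Lempel-Ziv, CTS)."""
        def __init__(self):
            self.counts = defaultdict(int)
            self.total = 0

        def update(self, symbol):
            self.counts[symbol] += 1
            self.total += 1

        def prob(self, symbol):
            # Add-one smoothing over a nominal alphabet of 256 symbols.
            return (self.counts[symbol] + 1) / (self.total + 256)

    def value_estimate(state, return_bins, models, prior):
        """V(state) = sum_b r_b * p(b | state), with
        p(b | state) proportional to p(state | b) * p(b), where p(state | b)
        comes from the per-bin density model and p(b) is an empirical prior."""
        posteriors = [models[b].prob(state) * prior[b]
                      for b in range(len(return_bins))]
        z = sum(posteriors) or 1.0
        return sum(r * w / z for r, w in zip(return_bins, posteriors))

    # Example (toy): two return bins with representative returns 0.0 and 1.0.
    # Update models[b] with state symbols whose observed return fell in bin b,
    # then query: v = value_estimate(state_symbol, [0.0, 1.0], models, [0.5, 0.5])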



Marc G. Bellemare received his Ph.D. from the University of Alberta, where he investigated the concept of domain-independent agents and led the design of the Arcade Learning Environment. His research interests include reinforcement learning, online learning, information theory, lifelong learning, and randomized algorithms. He is currently at Google DeepMind.

Joel Veness is a Senior Research Scientist at Google DeepMind. He is interested in reinforcement learning, universal source coding, Bayesian nonparametrics and game AI.
Date: 6 May 2015, 13:00 (Wednesday, 2nd week, Trinity 2015)
Venue: The Robert Hooke Building, Parks Road OX1 3PR
Venue Details: Tony Hoare Room, Department of Computer Science, Robert Hooke Building
Speakers: Marc Bellemare (Google DeepMind), Joel Veness (Google DeepMind)
Part of: Machine Learning Lunches
Booking required?: Not required
Audience: Public
Editor: Iurii Perov