A theory of the dynamics of deep learning: Consequences for semantic development
Anatomically, the brain is deep; computationally, deep learning is known to be hard. How might depth impact learning in the brain? To understand the specific ramifications of depth, I develop the theory of learning in deep linear neural networks. I give exact solutions to the dynamics of learning which specify how every weight in the network evolves over the course of training. The theory answers fundamental questions such as how learning speed scales with depth, and why unsupervised pretraining accelerates learning. Turning to generalization error, I use random matrix theory to analyze the cognitively relevant “high-dimensional” regime, where the number of training examples is on the order of, or even smaller than, the number of adjustable synapses. Consistent with the striking performance of very large deep network models in practice, I show that good generalization is possible in overcomplete networks due to implicit regularization in the dynamics of gradient descent. These results reveal a trade-off between training speed and generalization performance in deep networks.

Drawing on these findings, I then describe an application to human semantic development. From a stream of individual episodes, we abstract our knowledge of the world into categories and overarching structures like trees and hierarchies. I present an exactly solvable model of this process by considering how a deep network learns about richly structured environments specified as probabilistic graphical models. This scheme illuminates empirical phenomena documented by developmental psychologists, such as transient illusory correlations and changing patterns of inductive generalization. Neurally, the theory suggests that the representation of complex structures resides in the similarity structure of neural population responses, not the detailed activity patterns of individual neurons.

Overall, these results suggest that depth may be an important computational principle influencing learning in the brain. Deep linear networks yield a tractable theory of layered learning that interlinks computation, neural representations, and behavior.
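The mode-by-mode learning dynamics the abstract refers to are easy to see numerically. Below is a minimal sketch, not the speaker's code: it trains a two-layer linear network by plain gradient descent on a synthetic target map and prints the top singular values of the end-to-end map over training, which rise sequentially, strongest modes first. All dimensions, the learning rate, and the initialization scale are illustrative assumptions.

```python
# Minimal sketch (illustrative, not the speaker's code): gradient descent
# in a two-layer *linear* network, y = W2 @ W1 @ x. Each singular-value
# "mode" of the end-to-end map is learned on its own sigmoidal time
# course, strongest modes first.
import numpy as np

rng = np.random.default_rng(0)

# Target linear map with well-separated singular values (the "modes").
d = 8
U, _ = np.linalg.qr(rng.standard_normal((d, d)))
V, _ = np.linalg.qr(rng.standard_normal((d, d)))
s_target = np.array([5.0, 3.0, 1.5, 0.5, 0.2, 0.1, 0.05, 0.02])
W_star = U @ np.diag(s_target) @ V.T

# Small random initialization, as assumed by the linear-network theory.
scale = 1e-3
W1 = scale * rng.standard_normal((d, d))
W2 = scale * rng.standard_normal((d, d))

lr, steps = 0.05, 4000
history = []
for t in range(steps):
    E = W2 @ W1 - W_star            # error in the end-to-end map
    # Gradients of the loss 0.5 * ||W2 @ W1 - W_star||_F^2 per layer.
    gW2 = E @ W1.T
    gW1 = W2.T @ E
    W2 -= lr * gW2
    W1 -= lr * gW1
    if t % 200 == 0:
        sv = np.linalg.svd(W2 @ W1, compute_uv=False)
        history.append((t, sv[:3]))

for t, sv in history:
    print(f"step {t:5d}  top-3 singular values: "
          + "  ".join(f"{v:6.3f}" for v in sv))
```

Under small, roughly balanced initialization, each mode stays near zero for a time inversely related to its target singular value before rising rapidly, which is why the strongest structure is acquired first and weaker modes emerge in distinct stages.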
Date: 11 December 2017, 13:30
Venue: Le Gros Clark Building, off South Parks Road OX1 3QX
Venue Details: Lecture theatre
Speaker: Dr Andrew Saxe (Harvard)
Organiser contact email address: friedemann.zenke@cncb.ox.ac.uk
Hosts: Dr Friedemann Zenke (University of Oxford), Dr Tim Vogels
Part of: Oxford NeuroAI Forum
Booking required?: Not required
Audience: Members of the University only
Editor: Friedemann Zenke