Integrating new knowledge without catastrophic interference: Computational and theoretical investigations in a hierarchically structured environment

According to complementary learning systems theory, integrating new memories into a multi-layer neural network without interfering with what is already known depends on interleaving presentation of the new memories with ongoing presentations of previously learned items. I use deep linear neural networks in hierarchically structured environments previously analyzed by Saxe, McClelland, and Ganguli (SMG) to gain new insights into this process. The content of the environment I will consider in this talk can be described by the singular value decomposition (SVD) of the environment’s input-output covariance matrix, in which each successive dimension corresponds to a categorical split in the hierarchical environment. Prior work by SMG showed that deep linear networks are sufficient to learn the content of the environment, and that they do so in a stage-like way, with the strength of each dimension rising from near zero to its maximum after a delay inversely proportional to the strength of that dimension. Several observations then become accessible when we consider learning a new item not previously encountered in the micro-environment. (1) The item can be examined in terms of its projection onto the existing structure, and in terms of whether it adds a new categorical split. (2) To the extent that the item projects onto existing structure, including it in the training corpus leads to rapid adjustment of the representations of the categories involved, while effectively no adjustment occurs to categories onto which the new item does not project at all. (3) Learning a new split is slow, and its learning dynamics show the same delayed rise to maximum, with a delay that depends on the dimension’s strength. These observations then motivate the development of a similarity-weighted interleaved learning scheme in which only items similar to the to-be-learned new item need be presented to avoid catastrophic interference.
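The ideas in the abstract can be illustrated with a minimal sketch, which is not the speaker's own code. It assumes a toy hierarchy of 4 items and 7 binary features, one-hot item inputs, and the sigmoidal mode trajectory reported by Saxe et al. for deep linear networks started from small initial weights. The names `mode_strength`, `half_rise_time`, `a0`, `tau`, and `new_item` are hypothetical and chosen only for illustration. The sketch shows that the SVD of the input-output covariance orders the dimensions by hierarchical level, that stronger dimensions rise earlier, and how much of a new item is captured by the existing structure.

```python
import numpy as np

# Toy hierarchy (an assumption for illustration): 4 items, 7 binary features.
# Rows: one root feature, two branch features, four item-specific features.
Y = np.array([
    [1, 1, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
], dtype=float)
X = np.eye(4)                # one-hot item inputs, one column per item
n_items = X.shape[1]

# Input-output covariance matrix and its SVD.
# Each singular dimension corresponds to one split in the hierarchy.
sigma_yx = Y @ X.T / n_items
U, S, Vt = np.linalg.svd(sigma_yx, full_matrices=False)


def mode_strength(t, s, a0=1e-3, tau=1.0):
    """Sigmoidal rise of one dimension from a0 toward its final strength s.

    Assumed form: a(t) = s / (1 + (s / a0 - 1) * exp(-2 * s * t / tau)).
    """
    return s / (1.0 + (s / a0 - 1.0) * np.exp(-2.0 * s * t / tau))


def half_rise_time(s, a0=1e-3, tau=1.0):
    """Time at which mode_strength reaches s / 2; scales roughly as 1 / s."""
    return tau / (2.0 * s) * np.log(s / a0 - 1.0)


# Hypothetical new item: has the root feature and the first branch feature,
# but none of the existing item-specific features.
new_item = np.array([1, 1, 0, 0, 0, 0, 0], dtype=float)
coeffs = U.T @ new_item                # projection onto existing dimensions
residual = new_item - U @ coeffs       # part not explained by known structure
captured = 1.0 - (residual @ residual) / (new_item @ new_item)

print("singular values:", np.round(S, 3))
print("half-rise times:", np.round([half_rise_time(s) for s in S], 2))
print("fraction of new item captured by existing structure:", round(captured, 3))
```

In this toy setting the new item projects only onto the two broadest dimensions (the root and the first branch split), so under observation (2) only those representations would be expected to adjust; the residual is the part that would require new structure to be learned.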
Date: 22 August 2018, 14:00
Venue: Le Gros Clark Building, off South Parks Road OX1 3QX
Venue Details: Lecture Theatre
Speaker: Prof James L. McClelland (Stanford University)
Organiser: Dr Friedemann Zenke (University of Oxford)
Organiser contact email address: friedemann.zenke@cncb.ox.ac.uk
Host: Dr Friedemann Zenke (University of Oxford)
Part of: Oxford Neurotheory Forum
Booking required?: Not required
Audience: Members of the University only
Editor: Friedemann Zenke