How to learn useful representations in a structured world

Status: This talk has been cancelled

Meta-learning, the ability to acquire the structure or prior knowledge that facilitates new learning, relies heavily on structured data. Humans, deep RL agents, and even large language models (LLMs) are all capable of meta-learning. While recurrent neural network-based models can be linked to neural activations in biological organisms, understanding how LLMs perform this quick, in-context learning is more difficult. LLMs are pre-trained on human-generated artifacts, such as the internet and books, which contain substantial structure and enable good generalization. However, the lack of precise knowledge of their training data makes it challenging to quantify their performance, especially as they are increasingly deployed at scale in the real world. New approaches, drawn directly from the cognitive sciences, now allow us to interrogate more closely how these models work. In this talk I discuss how examining the structure in their training data through this lens can help us better understand both deep RL agents and LLMs, and why they are so powerful.