Retrospective learning in the brain

A hallmark of intelligence is the ability to learn associations between causes and effects (e.g., environmental cues and associated rewards). The near-consensus understanding of the last few decades is that animals learn cause-effect associations from errors in the prediction of the effect (e.g., a reward prediction error, or RPE). This theory has been hugely influential in neuroscience because decades of evidence suggested that mesolimbic dopamine (DA), known to be critical for associative learning, signals RPE. My lab has recently proposed an alternative theory (named ANCCR, read “anchor”), which postulates that animals learn associations by retrospectively identifying causes of meaningful effects such as rewards, and that mesolimbic dopamine conveys that a current event is meaningful. The core idea is simple: you can learn to predict the future by retrodicting the past, and you retrodict the past only after meaningful events. Here, I will present the basic formulation of this theory and its highly counterintuitive consequences for how learning rate depends on reward frequency. I will then present unpublished experimental results testing these predictions and demonstrating that cue-reward learning rate is proportionally scaled by the duration between rewards across a variety of experimental settings.