Opponent-Shaping and Interference in General-Sum Games

This is a hybrid meeting. Please find the Teams link below. Pizza will be provided after the talk at 6pm.

Abstract:
In general-sum games, the interaction of self-interested learning agents commonly leads to collectively worst-case outcomes, such as defect-defect in the iterated prisoner’s dilemma (IPD). To overcome this, some methods, such as Learning with Opponent-Learning Awareness (LOLA), shape their opponents’ learning process. However, these methods are myopic since only a small number of steps can be anticipated, are asymmetric since they treat other agents as naive learners, and require the use of higher-order derivatives, which are calculated through white-box access to an opponent’s differentiable learning algorithm. In this talk I will first introduce Model-Free Opponent Shaping (M-FOS), which overcomes all of these limitations. M-FOS learns in a meta-game in which each meta-step is an episode of the underlying (“inner”) game. The meta-state consists of the inner policies, and the meta-policy produces a new inner policy to be used in the next episode. M-FOS then uses generic model-free optimisation methods to learn meta-policies that accomplish long-horizon opponent shaping. I will finish off the talk with our recent results for adversarial (or cooperative) cheap-talk: How can agents interfere with (or support) the learning process of other agents without being able to act in the environment?

Bio:
Jakob Foerster started as an Associate Professor at the Department of Engineering Science at the University of Oxford in the fall of 2021. During his PhD at Oxford he helped bring deep multi-agent reinforcement learning to the forefront of AI research and interned at Google Brain, OpenAI, and DeepMind. After his PhD he worked as a research scientist at Facebook AI Research in California, where he continued doing foundational work. He was the lead organizer of the first Emergent Communication workshop at NeurIPS in 2017, which he has helped organize ever since, and he was awarded a prestigious CIFAR AI chair in 2019.

Teams link: teams.microsoft.com/l/meetup-join/19%3ameeting_ZDZiMzIwODMtNzVmZi00Y2U4LTliMGEtOTkyOWQ4YmIyYjQ1%40thread.v2/0?context=%7b%22Tid%22%3a%22cc95de1b-97f5-4f93-b4ba-fe68b852cf91%22%2c%22Oid%22%3a%222d6d82c4-6b2c-4f77-b979-7c49923c3b36%22%7d

Date: 7 March 2024, 17:00
Venue:
Wolfson College
Linton Road OX2 6UD

Details: Levett Room
Speaker: Prof Jakob Foerster (Oxford (FLAIR))
Organising department: Wolfson College
Organisers: Mr Csaba Botos (University of Oxford), Dr Yi Yin (Wolfson College, University of Oxford)
Organiser contact email address: yi.yin@wrh.ox.ac.uk
Part of: Oxford Cross-Disciplinary Machine Learning (OxfordXML) Research Cluster Seminar Series
Booking required?: Not required
Cost: Free (cake, tea and coffee provided)
Audience: Public
Editor: Yi Yin