Incomplete Preferences and Dynamic Consistency: Theory and Applications to AI Safety

Please note change of date and location.
The AI shutdown problem is, roughly, the problem of getting future advanced (agentic) AI systems to shut down when and only when we want them to. Thornley (2023) proved theorems showing that satisfying some rather minimal rationality criteria precludes an agent from being both capable and shutdownable. However, by denying the completeness axiom of utility theory, he manages to characterise an agent that is both capable and shutdownable. A seemingly relevant condition for an agent to be capably goal-directed is that it avoids sequences of actions that foreseeably leave it worse off. My project proposes a choice rule which, as I show, guarantees the dynamic consistency of agents with incomplete preferences.
Date: 29 November 2023, 15:00
Venue: Manor Road Building, Manor Road OX1 3UQ
Venue Details: Seminar Room B or https://zoom.us/j/97316381707?pwd=WDdBVlJVVU1Ga3JnRW15c0RnNlE5QT09
Speaker: Sami Petersen (University of Oxford)
Organising department: Department of Economics
Part of: Novel Ideas: MPhil Seminar Series
Booking required?: Not required
Audience: Members of the University only
Editor: Shreyasi Banerjee