Shutdownable Agents through POST-Agency

2505.20203

Many fear that future artificial agents will resist shutdown. I present an idea, the POST-Agents Proposal, for ensuring that doesn't happen. I propose that we train agents to satisfy Preferences Only Between Same-Length Trajectories (POST). I then prove that POST, together with other conditions, implies Neutrality+: the agent maximizes expected utility, ignoring the probability distribution over trajectory-lengths. I argue that Neutrality+ keeps agents shutdownable and allows them to be useful.

Subject: Artificial Intelligence

Publish: 2025-05-26 16:44:17 UTC
