Non-exponential Reward Discounting in Reinforcement Learning

26916@AAAI

Reinforcement learning methods typically discount future rewards exponentially, a choice that yields theoretical convergence guarantees. Studies from neuroscience, psychology, and economics suggest that human and animal behavior is better captured by the hyperbolic discounting model. Hyperbolic discounting has recently been studied in deep reinforcement learning and has shown promising results. However, the area appears understudied, with most existing and ongoing research still using the standard exponential formulation. My dissertation examines the effects of non-exponential discounting functions (such as hyperbolic) on an agent's learning and aims to investigate their impact on multi-agent systems and generalization tasks. A key objective of this study is to link the discounting rate to an agent's approximation of the underlying hazard rate of its environment through survival analysis.
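The abstract gives no formulas, so the sketch below only illustrates the two discounting schemes it names and one common way the hazard-rate link is made in the literature. It is not necessarily the dissertation's formulation. It assumes the standard forms gamma^t (exponential) and 1/(1 + k*t) (hyperbolic), and it assumes an exponential prior with mean k over an unknown constant hazard rate. Under that assumption, the expected survival probability at time t equals the hyperbolic discount. All function names and parameter values are hypothetical.

```python
import math


def exponential_discount(t, gamma=0.99):
    """Standard exponential discount gamma**t (gamma is an assumed value)."""
    return gamma ** t


def hyperbolic_discount(t, k=0.01):
    """Hyperbolic discount 1 / (1 + k*t) (k is an assumed value)."""
    return 1.0 / (1.0 + k * t)


def expected_survival(t, k=0.01, steps=100_000):
    """Average the survival probability exp(-lam*t) over an assumed
    exponential prior p(lam) = (1/k) * exp(-lam/k) on the hazard rate lam.

    Uses a midpoint rule on [0, 50*k]; the prior mass beyond that
    upper limit is negligible (on the order of exp(-50)).
    """
    upper = 50.0 * k
    h = upper / steps
    total = 0.0
    for i in range(steps):
        lam = (i + 0.5) * h
        prior = math.exp(-lam / k) / k
        total += prior * math.exp(-lam * t) * h
    return total


if __name__ == "__main__":
    for t in (0, 10, 100, 1000):
        print(
            t,
            exponential_discount(t),
            hyperbolic_discount(t),
            expected_survival(t),
        )
```

With a known, fixed hazard rate lam, the survival probability exp(-lam*t) is itself an exponential discount with gamma = exp(-lam). The hyperbolic form appears in this sketch only once the hazard rate is treated as uncertain.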

Subject: AAAI.2023 - Doctoral Consortium
