Co-Activation Graph Analysis of Safety-Verified and Explainable Deep Reinforcement Learning Policies

arXiv:2501.03142

Authors: Dennis Gross, Helge Spieker

Deep reinforcement learning (RL) policies can exhibit unsafe behaviors and are difficult to interpret. To address both challenges, we combine RL policy model checking (a technique for determining whether an RL policy exhibits unsafe behaviors) with co-activation graph analysis (a method that maps a neural network's inner workings by analyzing neuron activation patterns) to gain insight into the safe RL policy's sequential decision-making. This combination lets us interpret the RL policy's inner workings for safe decision-making. We demonstrate its applicability in various experiments.
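
To make the idea of a co-activation graph concrete, here is a minimal sketch and not the authors' implementation. It assumes a toy ReLU network with random weights, uses randomly sampled states as a stand-in for the states a model checker would select, and takes the absolute Pearson correlation between neuron activations as the edge weight. The function names, the threshold, and the choice of correlation measure are all illustrative assumptions.

```python
import numpy as np


def layer_activations(states, weights, biases):
    """Run a toy ReLU MLP and return all neuron activations, one row per state.

    ReLU is applied to every layer for simplicity (an assumption of this sketch).
    """
    collected = []
    h = states
    for W, b in zip(weights, biases):
        h = np.maximum(h @ W + b, 0.0)
        collected.append(h)
    return np.concatenate(collected, axis=1)


def co_activation_graph(activations, threshold=0.5):
    """Build a weighted adjacency matrix over neurons.

    Edge weight = |Pearson correlation| of two neurons' activations across
    the given states; edges below `threshold` are removed.
    """
    # Neurons with constant activation have undefined correlation (NaN);
    # treat them as uncorrelated with everything.
    with np.errstate(invalid="ignore", divide="ignore"):
        corr = np.corrcoef(activations, rowvar=False)
    adj = np.abs(np.nan_to_num(corr, nan=0.0))
    np.fill_diagonal(adj, 0.0)      # no self-loops
    adj[adj < threshold] = 0.0      # keep only strong co-activations
    return adj


# Hypothetical policy network: 4 state features -> 8 neurons -> 3 neurons.
rng = np.random.default_rng(0)
weights = [rng.normal(size=(4, 8)), rng.normal(size=(8, 3))]
biases = [np.zeros(8), np.zeros(3)]

# Placeholder for states selected by model checking (random here).
states = rng.normal(size=(200, 4))

acts = layer_activations(states, weights, biases)
graph = co_activation_graph(acts)

# Node degree is one simple way to rank how connected each neuron is.
degree = (graph > 0).sum(axis=1)
print("neurons:", graph.shape[0], "edges:", int((graph > 0).sum() // 2))
```

In a real pipeline, `states` would be replaced by the states relevant to a verified safety property, and graph measures such as degree or community structure could then be compared across properties.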

Subjects: Artificial Intelligence, Machine Learning

Published: 2025-01-06 17:07:44 UTC
