Impact of Markov Decision Process Design on Sim-to-Real Reinforcement Learning

Authors: Tatjana Krau, Jorge Mandlmaier, Tobias Damm, Frieder Heieck

Reinforcement Learning (RL) has demonstrated strong potential for industrial process control, yet policies trained in simulation often suffer from a significant sim-to-real gap when deployed on physical hardware. This work systematically analyzes how core Markov Decision Process (MDP) design choices (state composition, target inclusion, reward formulation, termination criteria, and environment dynamics models) affect this transfer. Using a color mixing task, we evaluate different MDP configurations and mixing dynamics across simulation and real-world experiments. We validate our findings on physical hardware, demonstrating that physics-based dynamics models achieve up to 50% real-world success under strict precision constraints where simplified models fail entirely. Our results provide practical MDP design guidelines for deploying RL in industrial process control.
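To make the listed design choices concrete, the following is a minimal sketch of how they might be exposed as switches on a toy color mixing environment. It is not the authors' implementation: `MDPConfig`, `ToyColorMixEnv`, the three pure RGB pigments, the default tolerance and step budget, and the linear volume-weighted mixing rule are all assumptions made for illustration. The linear rule stands in for a simplified dynamics model, not for the physics-based models the abstract refers to.

```python
from dataclasses import dataclass
import random


@dataclass
class MDPConfig:
    """Hypothetical switches for the design choices named in the abstract."""
    include_target: bool = True   # target inclusion in the observation
    dense_reward: bool = True     # reward formulation: shaped vs. sparse
    tolerance: float = 0.05       # assumed precision threshold for success
    max_steps: int = 50           # termination criterion: step budget


class ToyColorMixEnv:
    """Toy stand-in with linear (volume-weighted) RGB mixing; not a physics model."""

    def __init__(self, cfg: MDPConfig, seed: int = 0):
        self.cfg = cfg
        self.rng = random.Random(seed)
        # Three illustrative base pigments as RGB triples (assumption).
        self.pigments = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]

    def _blend(self, weights):
        # Weighted average of the pigment colors, channel by channel.
        total = sum(weights)
        return tuple(
            sum(w * p[c] for w, p in zip(weights, self.pigments)) / total
            for c in range(3)
        )

    def _error(self):
        # Largest per-channel deviation between current mix and target.
        mix = self._blend(self.volumes)
        return max(abs(m - t) for m, t in zip(mix, self.target))

    def _obs(self):
        # State composition: current color, optionally followed by the target.
        obs = list(self._blend(self.volumes))
        if self.cfg.include_target:
            obs += list(self.target)
        return obs

    def reset(self):
        self.steps = 0
        self.volumes = [1.0, 1.0, 1.0]
        # Sample a reachable target by blending with random positive weights.
        self.target = self._blend([self.rng.random() + 0.1 for _ in range(3)])
        return self._obs()

    def step(self, action):
        """action is a pair (pigment index, volume to add)."""
        idx, amount = action
        self.volumes[idx] += max(0.0, amount)  # volume can only be added
        self.steps += 1
        err = self._error()
        success = err <= self.cfg.tolerance
        reward = -err if self.cfg.dense_reward else (1.0 if success else 0.0)
        done = success or self.steps >= self.cfg.max_steps
        return self._obs(), reward, done, {"success": success, "error": err}
```

Under these assumptions, comparing configurations amounts to constructing the environment with different `MDPConfig` values, for example `MDPConfig(include_target=False, dense_reward=False)`, and training the same agent against each variant.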

Subject: Machine Learning

Publish: 2026-03-10 09:41:37 UTC

2603.09427
