A Kubernetes custom scheduler based on reinforcement learning for compute-intensive pods

Authors: Hanlin Zhou, Huah Yong Chan, Shun Yao Zhang, Meie Lin, Jingfei Ni

With the rise of cloud computing and lightweight containers, Docker has emerged as a leading technology for rapid service deployment, with Kubernetes responsible for pod orchestration. However, for compute-intensive workloads, particularly web services executing containerized machine-learning training, the default Kubernetes scheduler does not always achieve optimal placement. To address this, we propose two custom, reinforcement-learning-based schedulers, SDQN and SDQN-n, both built on the Deep Q-Network (DQN) framework. In compute-intensive scenarios, these models outperform the default Kubernetes scheduler as well as Transformer- and LSTM-based alternatives, reducing average CPU utilization per cluster node by 10%, and by over 20% when using SDQN-n. Moreover, our results show that SDQN-n's approach of consolidating pods onto fewer nodes further amplifies resource savings and helps advance greener, more energy-efficient data centers. Pod scheduling must therefore employ strategies tailored to each scenario in order to achieve better performance. Because the reinforcement-learning components of the SDQN and SDQN-n architectures proposed in this paper can be easily tuned by adjusting their parameters, they can accommodate the requirements of various future scenarios.

Subject: Distributed, Parallel, and Cluster Computing

Publish: 2026-01-20 04:06:24 UTC

2601.13579
