Hierarchical Reinforcement Learning with Topology-Aware Exploration Framework for Multi-path Commodity Flow Problem

Authors: Jingchen Jiang, Xuan Zhou, Jiayuan Li, Geng Han, Xiang Shi, Fang Deng

The multi-path commodity flow problem (MPCFP) is crucial for ensuring reliable and high-speed data transmission in communication networks. However, existing studies that employ pre-generated routing paths neglect the real-time load state and the coupling among decisions, which limits solution quality. To overcome this, we propose Hierarchical Reinforcement Learning with Topology-Aware Exploration (HRL-TAE), the first fully end-to-end framework that dynamically produces high-quality solutions based on real-time network states. HRL-TAE integrates an exploration mechanism and uses the State Transition Guiding List (STGL) to guide state transitions, thereby transforming topology exploration into a Markov decision process. Guided by the STGL, two closely coupled layers in HRL-TAE, namely the path construct layer and the ratio allocate layer, construct multiple subpaths for each flow and allocate traffic ratios among them. During decision making, adaptive constraint-driven masks exclude infeasible actions, thereby guaranteeing that all constraints are satisfied. We also adopt a tailored training approach to obtain accurate gradient estimates and improve training efficiency. Simulations and real-world experiments demonstrate that HRL-TAE achieves superior performance.
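The abstract does not specify how the constraint-driven masks are built, so the following is only a minimal sketch of the general action-masking idea, not the authors' implementation. It assumes two illustrative constraints (no revisiting a node, and enough residual capacity on the outgoing link), and every name in it (`feasible_mask`, `masked_softmax`, `residual_capacity`, `demand`) is hypothetical.

```python
import math

def feasible_mask(neighbors, visited, residual_capacity, demand):
    """Mark a neighbor feasible if it is unvisited and its link can carry the demand.

    Both constraints are assumptions made for illustration only.
    """
    return [
        (n not in visited) and (residual_capacity[n] >= demand)
        for n in neighbors
    ]

def masked_softmax(logits, mask):
    """Softmax restricted to feasible entries; infeasible entries get probability 0."""
    if not any(mask):
        raise ValueError("no feasible action")
    # Subtract the largest feasible logit for numerical stability.
    top = max(l for l, ok in zip(logits, mask) if ok)
    exps = [math.exp(l - top) if ok else 0.0 for l, ok in zip(logits, mask)]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical next-hop decision at one node with three neighbors.
neighbors = ["b", "c", "d"]
mask = feasible_mask(
    neighbors,
    visited={"b"},
    residual_capacity={"b": 10.0, "c": 2.0, "d": 8.0},
    demand=5.0,
)
probs = masked_softmax([1.0, 2.0, 0.5], mask)
```

Because infeasible actions receive exactly zero probability, a policy that samples from `probs` can never choose them, which is the sense in which masking enforces constraints by construction rather than through a penalty term.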

Subject: AAAI.2026 - Planning, Routing, and Scheduling

40947@AAAI
