2601.13026

Total: 1

#1 Multi-gear bandits, partial conservation laws, and indexability [PDF] [Copy] [Kimi] [REL]

Author: José Niño-Mora

This paper considers what we propose to call multi-gear bandits, which are Markov decision processes modeling a generic dynamic and stochastic project fueled by a single resource and which admit multiple actions representing gears of operation naturally ordered by their increasing resource consumption. The optimal operation of a multi-gear bandit aims to strike a balance between project performance costs or rewards and resource usage costs, which depend on the resource price. A computationally convenient and intuitive optimal solution is available when such a model is indexable, meaning that its optimal policies are characterized by a dynamic allocation index (DAI), a function of state--action pairs representing critical resource prices. Motivated by the lack of general indexability conditions and efficient index-computing schemes, and focusing on the infinite-horizon finite-state and -action discounted case, we present a verification theorem ensuring that, if a model satisfies two proposed PCL-indexability conditions with respect to a postulated family of structured policies, then it is indexable and such policies are optimal, with its DAI being given by a marginal productivity index computed by a downshift adaptive-greedy algorithm in $A N$ steps, with $A+1$ actions and $N$ states. The DAI is further used as the basis of a new index policy for the multi-armed multi-gear bandit problem.

Subject: Optimization and Control

Publish: 2026-01-19 13:06:31 UTC