MLSys 2023


Virtual Machine Allocation with Lifetime Predictions

Authors: Hugo Barbalho ; Patricia Kovaleski ; Beibin Li ; Luke Marshall ; Marco Molinaro ; Abhisek Pan ; Eli Cortez ; Matheus Leao ; Harsh Patwari ; Zuzu Tang ; Larissa Rozales Gonçalves ; David Dion ; Thomas Moscibroda ; Ishai Menache

The emergence of machine learning technology has motivated the use of ML-based predictors in computer systems to improve their efficiency and robustness. However, numerous algorithmic and systems challenges remain in using ML models effectively in large-scale resource management services that require high throughput and millisecond-scale response latency. In this paper, we describe the design and implementation of a VM allocation service that uses ML predictions of VM lifetime to improve packing efficiency. We design lifetime-aware placement algorithms that are provably robust to prediction errors and demonstrate their merits in extensive real-trace simulations. We significantly upgraded the VM allocation infrastructure of Microsoft Azure to support such algorithms, which require ML inference in the critical path. A robust version of our algorithms has recently been deployed in production and achieves the efficiency improvements expected from simulations.
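To make the idea of lifetime-aware placement concrete, the following is a minimal sketch of one possible heuristic. It is not the algorithm from the paper, and it carries none of the paper's robustness guarantees. All names (`Host`, `place`, `drain_time`) and the scoring rule are assumptions made for illustration: a VM is placed on the feasible host whose predicted emptying time it would extend the least, with ties broken by best fit on cores.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """Hypothetical host model: core capacity plus (cores, predicted_end) per VM."""
    name: str
    capacity: int
    vms: list = field(default_factory=list)

    @property
    def free(self) -> int:
        return self.capacity - sum(cores for cores, _ in self.vms)

    @property
    def drain_time(self) -> float:
        # Predicted time at which the host becomes empty (0.0 if it is empty now).
        return max((end for _, end in self.vms), default=0.0)


def place(hosts, cores, now, predicted_lifetime):
    """Place a VM and return the chosen host, or None if no host has room.

    Illustrative scoring (an assumption, not the paper's method):
    1. minimize how far the VM pushes out the host's predicted drain time;
    2. break ties by leaving the fewest free cores (best fit).
    """
    end = now + predicted_lifetime
    candidates = [h for h in hosts if h.free >= cores]
    if not candidates:
        return None

    def score(h):
        extension = max(0.0, end - max(h.drain_time, now))
        return (extension, h.free - cores)

    best = min(candidates, key=score)
    best.vms.append((cores, end))
    return best


if __name__ == "__main__":
    hosts = [
        Host("A", 8, [(4, 100.0)]),  # holds a long-lived VM
        Host("B", 8, [(4, 10.0)]),   # holds a short-lived VM
        Host("C", 8),                # empty
    ]
    # A VM predicted to live 90 time units fits alongside A's long-lived VM.
    print(place(hosts, cores=4, now=0.0, predicted_lifetime=90.0).name)
    # A VM predicted to live 5 time units goes to B without delaying its drain.
    print(place(hosts, cores=2, now=0.0, predicted_lifetime=5.0).name)
```

Grouping VMs with similar predicted end times lets whole hosts empty out together, which is the intuition behind using lifetime predictions for packing. A production policy would also need to tolerate mispredicted lifetimes, which this sketch does not attempt.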