Performance

2025-03-28 | Total: 2

#1 Cloud Resource Allocation with Convex Optimization

Authors: Shayan Boghani, Emin Kirimlioglu, Amrita Moturi, Hao-Ting Tso

We present a convex optimization framework for overcoming the limitations of Kubernetes Cluster Autoscaler by intelligently allocating diverse cloud resources while minimizing costs and fragmentation. Current Kubernetes scaling mechanisms are restricted to homogeneous scaling of existing node types, limiting cost-performance optimization possibilities. Our matrix-based model captures resource demands, costs, and capacity constraints in a unified mathematical framework. A key contribution is our logarithmic approximation to the indicator function, which enables dynamic node type selection while maintaining problem convexity. Our approach balances cost optimization with operational complexity through interior-point methods. Experiments with real-world Kubernetes workloads demonstrate reduced costs and improved resource utilization compared to conventional Cluster Autoscaler strategies that can only scale up or down existing node pools.

Subjects: Distributed, Parallel, and Cluster Computing, Performance

Publish: 2025-03-27 02:29:55 UTC
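
The abstract for #1 mentions a logarithmic approximation to the indicator function but does not give its form. As a rough illustration only, the sketch below uses one common smooth surrogate, log(1 + x/eps) / log(1 + 1/eps), to approximate "is this node type used at all?" for a usage fraction x between 0 and 1. The function names, the eps parameter, and the surrogate itself are assumptions for illustration, not the authors' formulation.

```python
import math


def log_indicator(x: float, eps: float = 1e-3) -> float:
    """Smooth stand-in for the indicator 1[x > 0] on 0 <= x <= 1.

    Hypothetical surrogate (not taken from the paper): it equals 0 at
    x = 0, equals 1 at x = 1, and rises steeply near 0 when eps is small.
    """
    if x < 0:
        raise ValueError("x must be non-negative")
    return math.log1p(x / eps) / math.log1p(1.0 / eps)


def approx_active_types(fractions, eps: float = 1e-3) -> float:
    """Approximate how many node types are in use.

    `fractions` holds one usage fraction per node type (an assumed input
    format); the exact count would be sum(1 for f in fractions if f > 0).
    """
    return sum(log_indicator(f, eps) for f in fractions)


if __name__ == "__main__":
    usage = [0.0, 0.5, 1.0]  # made-up usage fractions for three node types
    print("exact count:", sum(1 for f in usage if f > 0))
    print("smooth approximation:", approx_active_types(usage))
```

A smaller eps makes the surrogate hug the true indicator more tightly, at the price of a steeper slope near zero.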


#2 Harnessing Chain-of-Thought Metadata for Task Routing and Adversarial Prompt Detection

Authors: Ryan Marinelli, Josef Pichlmeier, Tamas Bisztray

In this work, we propose a metric called Number of Thoughts (NofT) to determine the difficulty of tasks before prompting and to support Large Language Models (LLMs) in production contexts. By setting thresholds based on the number of thoughts, this metric can discern the difficulty of prompts and support more effective prompt routing. A 2% decrease in latency is achieved when routing prompts from the MathInstruct dataset through quantized, distilled versions of Deepseek with 1.7 billion, 7 billion, and 14 billion parameters. Moreover, this metric can be used to detect adversarial prompts used in prompt injection attacks with high efficacy. The Number of Thoughts can inform a classifier that achieves 95% accuracy in adversarial prompt detection. Our experiments and the datasets used are available on our GitHub page: https://github.com/rymarinelli/Number_Of_Thoughts/tree/main.

Subjects: Computation and Language, Artificial Intelligence, Performance

Publish: 2025-03-27 12:54:00 UTC
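
The threshold-based routing described in the abstract for #2 can be sketched roughly as follows. The cutoff values and tier labels here are hypothetical, and the thought count is taken as a given input, since the abstract does not say how it is obtained before prompting.

```python
from bisect import bisect_right

# Hypothetical tiers, ordered smallest to largest model.
TIERS = ["small", "medium", "large"]

# Hypothetical cutoffs: fewer than 3 thoughts goes to "small",
# 3 to 7 goes to "medium", 8 or more goes to "large".
THRESHOLDS = [3, 8]


def route_by_thoughts(num_thoughts: int,
                      thresholds=THRESHOLDS,
                      tiers=TIERS) -> str:
    """Pick a model tier from an estimated number of thoughts."""
    if num_thoughts < 0:
        raise ValueError("num_thoughts must be non-negative")
    if len(tiers) != len(thresholds) + 1:
        raise ValueError("need exactly one more tier than thresholds")
    # bisect_right counts how many cutoffs are <= num_thoughts,
    # which is the index of the tier to use.
    return tiers[bisect_right(thresholds, num_thoughts)]


if __name__ == "__main__":
    for n in (1, 3, 7, 8, 20):
        print(n, "->", route_by_thoughts(n))
```

Keeping the cutoffs in a sorted list means adding a tier only requires one more threshold and one more label.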