Frontier large language models are increasingly powerful, yet many are trained on vast proprietary data with intensive compute, raising barriers to exploration and improvement for academic labs and smaller institutions. In this talk, I will present a unified research agenda for breaking this resource monopoly in both post-training and serving. On the training side, I will describe label-free and even zero-data post-training pipelines that let models curate their own reasoning supervision. On the serving side, I will show how cost-aware inference makes adaptive test-time scaling more efficient. Together, these components form a practical LLM system built on modest data and compute resources.
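
To give a rough feel for the kind of cost-aware test-time scaling the abstract alludes to (the talk's actual method is not specified here), the sketch below spends extra samples only on queries where the sampled answers disagree, stopping early once a majority answer is sufficiently confident. Everything in it is an assumption for illustration: `generate_answer` is a hypothetical stand-in for a real LLM call, and the budgets and agreement threshold are arbitrary.

```python
# Illustrative sketch of confidence-gated adaptive sampling under a fixed
# per-query budget. Not the speaker's system; all names and numbers here
# are hypothetical placeholders.
import random
from collections import Counter


def generate_answer(prompt: str) -> str:
    """Hypothetical stand-in for one sampled LLM completion."""
    return random.choice(["A", "A", "B"])  # placeholder behavior


def adaptive_answer(prompt: str, min_samples: int = 3,
                    max_samples: int = 9, agree_frac: float = 0.8) -> str:
    """Draw a few samples; keep sampling only while the majority answer's
    agreement stays below `agree_frac` and the budget is not exhausted."""
    votes = Counter(generate_answer(prompt) for _ in range(min_samples))
    n = min_samples
    while n < max_samples:
        answer, count = votes.most_common(1)[0]
        if count / n >= agree_frac:
            return answer  # confident: stop early and save compute
        votes[generate_answer(prompt)] += 1  # uncertain: buy one more sample
        n += 1
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    print(adaptive_answer("What is 2 + 2?"))
```

The design choice this sketch captures is that easy queries terminate at the minimum sample count, so the extra inference cost of test-time scaling is concentrated on the hard queries that actually benefit from it.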