Code-Optimise: Self-Generated Preference Data for Correctness and Efficiency


Authors: Leonidas Gee, Milan Gritta, Gerasimos Lampouras, Ignacio Iacobacci

Code Language Models have been trained to generate accurate solutions, typically with no regard for runtime. On the other hand, previous works that explored execution optimisation have observed corresponding drops in functional correctness. To that end, we introduce Code-Optimise, a framework that incorporates both correctness (passed, failed) and runtime (quick, slow) as learning signals via self-generated preference data. Our framework is both lightweight and robust as it dynamically selects solutions to reduce overfitting while avoiding a reliance on larger models for learning signals. Code-Optimise achieves significant improvements in pass@k while decreasing the competitive baseline runtimes by an additional 6% for in-domain data and up to 3% for out-of-domain data. As a by-product, the average length of the generated solutions is reduced by up to 48% on MBPP and 23% on HumanEval, resulting in faster and cheaper inference. The generated data and codebase is open-sourced at https://github.com/huawei-noah/HEBO/tree/Code_Optimise.
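To make the idea of combining correctness (passed, failed) and runtime (quick, slow) signals concrete, the following is a minimal sketch of how preference pairs might be built from a model's own sampled solutions. It is an illustration under stated assumptions, not the authors' implementation: the `Sample` class, the `build_preference_pairs` function, and the exhaustive pairing rule are all hypothetical, and the abstract does not specify how the framework actually selects or pairs solutions.

```python
from dataclasses import dataclass
from itertools import product


@dataclass
class Sample:
    """One self-generated candidate solution (hypothetical structure)."""
    code: str       # solution text sampled from the model
    passed: bool    # True if the solution passed its unit tests
    runtime: float  # measured runtime, assumed here to be in seconds


def build_preference_pairs(samples):
    """Return (chosen, rejected) code pairs for a single problem.

    Assumed pairing rule, for illustration only:
      * correctness: every passing solution is preferred to every failing one
      * runtime: among passing solutions, a strictly faster one is preferred
    """
    passed = sorted((s for s in samples if s.passed), key=lambda s: s.runtime)
    failed = [s for s in samples if not s.passed]

    pairs = []
    # Correctness signal: passed > failed.
    for good, bad in product(passed, failed):
        pairs.append((good.code, bad.code))
    # Runtime signal: quick > slow, restricted to passing solutions.
    for i, quick in enumerate(passed):
        for slow in passed[i + 1:]:
            if quick.runtime < slow.runtime:
                pairs.append((quick.code, slow.code))
    return pairs


# Toy example with made-up solutions and timings.
samples = [
    Sample("a", True, 0.1),
    Sample("b", True, 0.3),
    Sample("c", False, 0.05),
]
pairs = build_preference_pairs(samples)
print(pairs)  # [('a', 'c'), ('b', 'c'), ('a', 'b')]
```

Pairs of this shape could then be fed to a preference-based fine-tuning objective. Note that the failing solution `c` is never preferred even though it is the fastest, which reflects the abstract's point that runtime should not be optimised at the cost of functional correctness.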

Subject: NAACL.2025 - Findings

2025.findings-naacl.5@ACL
