2602.04910

Total: 1

#1 Reducing the Costs of Proof Synthesis on Rust Systems by Scaling Up a Seed Training Set [PDF1] [Copy] [Kimi1] [REL]

Authors: Nongyu Di, Tianyu Chen, Shan Lu, Shuai Lu, Yeyun Gong, Peng Cheng, Jacob R. Lorch, Yuan Yao, Xiaoxing Ma

Large Language Models (LLMs) are widely used for code generation. However, the correctness of code generated by LLMs remains a concern. A potential remedy to this concern is to have LLMs generate formal correctness proofs along with such code. However, compared with code generation, code-proof generation requires much higher reasoning capability and has much less existing data to learn from. In this paper, we present VeruSyn, a data synthesis pipeline for Verus, a state-of-the-art verification tool for system software written in Rust. Through self-synthesis and tutorial-based synthesis, VeruSyn achieves much larger scale and Verus-feature coverage than previous data-synthesis techniques designed for Verus; VeruSyn also supplements its dataset with long-chain-of-thought (CoT) data through agent trajectory synthesis. With VeruSyn, we synthesize the largest set of Verus verified programs: 6.9 million Rust programs, each with a formal specification and a proof that it meets that specification. This dataset lets us create a fine-tuned Qwen2.5-Coder-32B-Instruct model with appealing cost-proof tradeoff compared with state-of-the-art commercial models like Claude Sonnet 4.5. It also significantly outperforms models like o4-mini and previously proposed research models.

Subject: Software Engineering

Publish: 2026-02-04 01:04:56 UTC