nHNYDM6PVz@OpenReview

Total: 1

#1 NFIG: Multi-Scale Autoregressive Image Generation via Frequency Ordering [PDF] [Copy] [Kimi] [REL]

Authors: Zhihao Huang, Xi Qiu, Yukuo Ma, Yifu Zhou, Junjie Chen, Hongyuan Zhang, Chi Zhang, Xuelong Li

Autoregressive models have achieved significant success in image generation. However, unlike the inherent hierarchical structure of image information in the spectral domain, standard autoregressive methods typically generate pixels sequentially in a fixed spatial order. To better leverage this spectral hierarchy, we introduce Next-Frequency Image Generation (NFIG). NFIG is a novel framework that decomposes the image generation process into multiple frequency-guided stages. NFIG aligns the generation process with the natural image structure. It does this by first generating low-frequency components, which efficiently capture global structure with significantly fewer tokens, and then progressively adding higher-frequency details. This frequency-aware paradigm offers substantial advantages: it not only improves the quality of generated images but crucially reduces inference cost by efficiently establishing global structure early on. Extensive experiments on the ImageNet-256 benchmark validate NFIG's effectiveness, demonstrating superior performance (FID: 2.81) and a notable 1.25x speedup compared to the strong baseline VAR-d20.

Subject: NeurIPS.2025 - Poster