2508.05207

SpectroStream: A Versatile Neural Codec for General Audio

Authors: Yunpeng Li, Kehang Han, Brian McWilliams, Zalan Borsos, Marco Tagliasacchi

We propose SpectroStream, a full-band multi-channel neural audio codec. As the successor to the well-established SoundStream, SpectroStream extends its capability beyond 24 kHz monophonic audio and enables high-quality reconstruction of 48 kHz stereo music at bit rates of 4 to 16 kbps. This is accomplished with a new neural architecture that operates on a time-frequency representation of the audio, which leads to better audio quality, especially at higher sample rates. The model also uses a delayed-fusion strategy to handle multi-channel audio, which is crucial in balancing per-channel acoustic quality and cross-channel phase consistency.
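
To put the quoted bit rates in perspective, the short sketch below compares them with uncompressed PCM at the same sample rate and channel count. The 16-bit sample depth and the helper names `pcm_bitrate_kbps` and `compression_ratio` are assumptions made for illustration; the abstract does not state a reference format or report compression ratios.

```python
def pcm_bitrate_kbps(sample_rate_hz, channels, bits_per_sample=16):
    """Bit rate of uncompressed PCM audio in kbps (16-bit depth is an assumption)."""
    return sample_rate_hz * channels * bits_per_sample / 1000


def compression_ratio(codec_kbps, sample_rate_hz=48_000, channels=2, bits_per_sample=16):
    """How many times smaller the codec stream is than the PCM reference."""
    return pcm_bitrate_kbps(sample_rate_hz, channels, bits_per_sample) / codec_kbps


# 48 kHz stereo, as in the abstract; the two ends of the quoted 4 to 16 kbps range.
raw_kbps = pcm_bitrate_kbps(48_000, 2)
for codec_kbps in (4, 16):
    ratio = compression_ratio(codec_kbps)
    print(f"{codec_kbps} kbps: about {ratio:.0f}x smaller than {raw_kbps:.0f} kbps PCM")
```

Under these assumptions, 48 kHz stereo PCM runs at 1536 kbps, so the quoted range corresponds to roughly 96x to 384x fewer bits than the raw signal.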

Subjects: Sound, Artificial Intelligence, Audio and Speech Processing

Published: 2025-08-07 09:44:00 UTC