Time-Masked Transformers with Lightweight Test-Time Adaptation for Neural Speech Decoding

Authors: Ebrahim Feghhi, Shreyas Kaasyap, Nima Ryan Hadidi, Jonathan Kao

Speech neuroprostheses aim to restore communication for people with severe paralysis by decoding speech directly from neural activity. To accelerate algorithmic progress, a recent benchmark released intracranial recordings from a paralyzed participant attempting to speak, along with a baseline decoding algorithm. Prior work on the benchmark showed impressive accuracy gains, but these gains came at higher computational cost and were not demonstrated in a real-time decoding setting. Here, we make three contributions that pave the way towards accurate, efficient, and real-time neural speech decoding. First, we incorporate large amounts of time-masking during training: on average, over $50\%$ of each trial is masked. Second, we replace the gated recurrent unit (GRU) architecture used in the baseline algorithm with a compact Transformer, which uses $83\%$ fewer parameters, cuts peak GPU memory usage by a relative $52\%$, and is significantly faster to calibrate than the GRU. Third, we design a lightweight variant of an existing test-time adaptation method developed for decoding handwriting from neural activity. Our variant adapts the model using multiple time-masked augmentations of a single trial and requires only one gradient step per trial. Together, these contributions reduce word error rate by $20\%$ and effectively mitigate performance degradation across held-out days in a real-time decoding setting, while substantially lowering computational costs.
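
As a rough illustration of two mechanisms the abstract describes (heavy time-masking of trials during training, and a one-gradient-step test-time adaptation over several masked views of a single trial), here is a minimal PyTorch-style sketch. The `time_mask` helper, the entropy-minimization adaptation objective, the `model` interface, and all shapes and hyperparameters are illustrative assumptions, not the authors' implementation.

```python
import torch


def time_mask(neural: torch.Tensor, mask_frac: float = 0.5) -> torch.Tensor:
    """Zero out randomly selected time bins of a (time, channels) trial.

    With mask_frac = 0.5, roughly half of the time bins are masked,
    mirroring the abstract's "over 50% of each trial is masked" on average.
    """
    keep = torch.rand(neural.shape[0], device=neural.device) > mask_frac
    return neural * keep.unsqueeze(-1).float()


def lightweight_tta_step(model, trial, optimizer, n_views=4, mask_frac=0.5):
    """Adapt on a single trial with one gradient step over masked views.

    The entropy-minimization objective below is an assumed stand-in for
    the self-supervised loss used by the actual method.
    """
    model.train()
    optimizer.zero_grad()
    # Build several time-masked augmentations of the same trial.
    views = torch.stack([time_mask(trial, mask_frac) for _ in range(n_views)])
    logits = model(views)                       # (n_views, time, vocab), assumed
    probs = logits.softmax(dim=-1)
    loss = -(probs * probs.clamp_min(1e-8).log()).sum(dim=-1).mean()
    loss.backward()
    optimizer.step()                            # exactly one gradient step per trial
    model.eval()
    with torch.no_grad():
        return model(trial.unsqueeze(0))        # decode the unmasked trial
```

Under these assumptions, the per-trial adaptation cost stays small because only one optimizer step is taken and the extra forward passes reuse the same trial, which is what makes this kind of test-time adaptation plausible in a real-time decoding loop.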

Subject: NeurIPS.2025 - Poster