2606.17013

Exploding and vanishing gradients in deep neural networks: the effect of residual connections

Author: Vivek S Borkar

The well-known phenomenon of exploding and vanishing gradients in deep neural networks is analyzed using multiplicative ergodic theory. The effect of adding a residual connection is explained in this context. Specifically, a characterization of Liapunov exponents due to Furstenberg and Kifer is exploited to make a precise statement about the Liapunov spectrum and the effect of residual connections on it.
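
As a rough numerical illustration of the quantity the abstract refers to, the sketch below estimates the top Liapunov exponent of a product of random matrices, once for plain layers and once with a residual (identity) term added. It is not the paper's method: the Furstenberg-Kifer characterization is an analytical tool, whereas this is a simple Monte Carlo estimate. The function name `top_liapunov_exponent`, the i.i.d. Gaussian model for the layer Jacobians, and the values of `n`, `depth`, and `sigma` are all assumptions made for the example.

```python
import math
import random


def top_liapunov_exponent(n, depth, sigma, residual, seed=0):
    """Estimate the top Liapunov exponent of a product of random matrices.

    Each factor is a hypothetical layer Jacobian W with i.i.d. Gaussian
    entries of variance sigma^2 / n (an assumption, not taken from the paper).
    With residual=True the factor is I + W instead of W.
    """
    rng = random.Random(seed)
    v = [1.0 / math.sqrt(n)] * n  # unit start vector
    log_growth = 0.0
    for _ in range(depth):
        w = [[rng.gauss(0.0, sigma / math.sqrt(n)) for _ in range(n)]
             for _ in range(n)]
        # Apply the layer: u = W v, or u = (I + W) v with a residual connection.
        u = [sum(w[i][j] * v[j] for j in range(n)) for i in range(n)]
        if residual:
            u = [u[i] + v[i] for i in range(n)]
        norm = math.sqrt(sum(x * x for x in u))
        # Accumulate the log growth rate and renormalize to avoid under/overflow.
        log_growth += math.log(norm)
        v = [x / norm for x in u]
    return log_growth / depth


plain = top_liapunov_exponent(n=4, depth=2000, sigma=0.5, residual=False)
skip = top_liapunov_exponent(n=4, depth=2000, sigma=0.5, residual=True)
print(f"plain: {plain:.3f}, residual: {skip:.3f}")
```

Under these assumptions, a negative estimate corresponds to norms that shrink geometrically with depth (vanishing), a positive one to geometric growth (exploding), and a value near zero to roughly stable propagation. The renormalization at every step is what keeps the estimate numerically stable at large depth.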

Subjects: Optimization and Control, Machine Learning

Published: 2026-06-15 17:46:16 UTC