
Robust Weight Initialization for Tanh Neural Networks with Fixed Point Analysis

Authors: Hyunwoo Lee, Hayoung Choi, Hyunju Kim

As a neural network's depth increases, its generalization performance can improve. However, training deep networks is challenging due to gradient and signal propagation issues. To address these challenges, extensive theoretical research has been conducted and various methods have been introduced. Despite these advances, effective weight initialization methods for tanh neural networks remain underexplored. This paper presents a novel weight initialization method for neural networks with the tanh activation function. Based on an analysis of the fixed points of the function tanh(ax), the proposed method determines values of a that mitigate activation saturation. Experiments on various classification datasets and on Physics-Informed Neural Networks demonstrate that the proposed method outperforms Xavier initialization (with or without normalization) in robustness to network size variations, data efficiency, and convergence speed.
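The abstract only names the fixed-point idea, so here is a minimal NumPy sketch of it. For a gain a > 1, tanh(ax) has a stable nonzero fixed point x* = tanh(a x*) that simple iteration finds; the size of x* relative to the saturated region |x| near 1 indicates how saturated repeated tanh layers with that gain would become. The helper names and the Xavier-style rescaling below are illustrative assumptions, not the paper's derived initializer.

    import numpy as np

    def tanh_fixed_point(a, x0=1.0, tol=1e-12, max_iter=10_000):
        # Iterate x <- tanh(a*x). For a <= 1 the only fixed point is 0;
        # for a > 1 the iteration converges from x0 > 0 to the positive
        # nonzero fixed point x* = tanh(a*x*).
        x = x0
        for _ in range(max_iter):
            x_new = np.tanh(a * x)
            if abs(x_new - x) < tol:
                break
            x = x_new
        return x

    # How the fixed point moves with the gain a: a slightly above 1
    # keeps x* in the near-linear region, while large a pushes x*
    # toward the saturated region near 1.
    for a in (1.1, 1.5, 2.0, 3.0):
        x_star = tanh_fixed_point(a)
        slope = a * (1.0 - x_star**2)  # derivative of tanh(ax) at x*
        print(f"a={a:.1f}  x*={x_star:.4f}  slope at x*={slope:.4f}")

    def scaled_xavier_tanh(fan_in, fan_out, a=1.5, rng=None):
        # Hypothetical illustration: a Xavier-style normal init rescaled
        # by a chosen gain a; NOT the paper's exact formula.
        rng = np.random.default_rng() if rng is None else rng
        std = a * np.sqrt(2.0 / (fan_in + fan_out))
        return rng.normal(0.0, std, size=(fan_out, fan_in))

The printed slope a(1 - x*^2) stays below 1 at the nonzero fixed point, which is why the iteration converges and why repeated tanh layers with gain a drive typical activations toward x* rather than collapsing to 0 (a <= 1) or fully saturating; choosing a so that x* sits away from 1 is the saturation-mitigation intuition the abstract describes.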

Subject: ICLR.2025 - Poster