Existing works on the expressive power of neural networks typically assume real-valued parameters and exact mathematical operations during the evaluation of networks. However, neural networks running on actual computers can take parameters only from a small subset of the reals and perform inexact mathematical operations, with round-off errors and overflows. In this work, we study the expressive power of floating-point neural networks, i.e., networks with floating-point parameters and operations. We first observe that, for floating-point neural networks to represent all functions from floating-point vectors to floating-point vectors, it is necessary that they distinguish different inputs: the first layer of a network should be able to generate different outputs for different inputs. We also prove that such distinguishability, together with mild conditions on activation functions, is sufficient. Our result shows that with practical activation functions, floating-point neural networks can represent floating-point functions from a wide domain to all finite or infinite floats. For example, the domain is all finite floats for Sigmoid and tanh, and it is all finite floats of magnitude less than 1/8 times the largest float for ReLU, ELU, SeLU, GELU, Swish, Mish, and sin.
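To illustrate the distinguishability requirement, the following minimal Python sketch (not taken from the paper; it uses IEEE-754 double precision via NumPy rather than the paper's formal floating-point model, and the weight and bias are arbitrary illustrative values) shows two distinct floating-point inputs that an affine first-layer unit collapses to the same output because of round-off, after which no subsequent layer can separate them.

```python
import numpy as np

# Two adjacent but distinct float64 inputs.
x1 = np.float64(1.0)
x2 = np.nextafter(x1, np.float64(2.0))  # smallest float64 strictly greater than 1.0

# A single affine "first-layer" unit with illustrative weight and bias.
w, b = np.float64(0.5), np.float64(100.0)
y1 = w * x1 + b  # exactly 100.5
y2 = w * x2 + b  # 100.5 + 2**-53 rounds back to 100.5

print(x1 == x2)  # False: the inputs are distinct floats
print(y1 == y2)  # True:  round-off makes the layer's outputs identical,
                 #        so no later layer can tell x1 and x2 apart
```

In this toy setting the collision is caused by a poor choice of first-layer parameters; the paper's necessary condition asks that the first layer avoid such collisions, and the sufficiency result states that, under mild conditions on the activation function, avoiding them is enough.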