In this paper, we investigate the universal approximation property of deep, narrow multilayer perceptrons (MLPs) for $C^1$ functions in the Sobolev $W^{1,\infty}$ norm. Although the optimal width of deep, narrow MLPs for approximating continuous functions has been studied extensively, significantly less is known about the corresponding optimal width for $C^1$ functions. We demonstrate that \textit{the optimal width} can be determined in a wide range of cases within the $C^1$ setting. Our approach consists of two main steps. First, leveraging control theory, we show that any diffeomorphism can be approximated by deep, narrow MLPs. Second, using the Borsuk--Ulam theorem and various results from differential geometry, we prove that the optimal width for approximating arbitrary $C^1$ functions via diffeomorphisms is $\min(n + m, \max(2n + 1, m))$ in certain cases, including $(n,m) = (8,8)$ and $(16,8)$, where $n$ and $m$ denote the input and output dimensions, respectively. Our results apply to a broad class of activation functions.
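For concreteness, the stated width expression can be evaluated directly at the two dimension pairs cited above; the following display is an illustrative arithmetic check of the formula $\min(n + m, \max(2n + 1, m))$, not an additional result:
% Evaluating the claimed optimal width at (n,m) = (8,8) and (16,8).
\begin{align*}
  (n,m) = (8,8):\quad  & \min(8+8,\; \max(2\cdot 8+1,\; 8)) = \min(16,\, 17) = 16,\\
  (n,m) = (16,8):\quad & \min(16+8,\; \max(2\cdot 16+1,\; 8)) = \min(24,\, 33) = 24.
\end{align*}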