Characterizing ResNet’s Universal Approximation Capability

Authors: Chenghao Liu, Enming Liang, Minghua Chen

ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In this section, we provide function approximation results to numerically validate the theoretical results presented in Sec. 4. To emphasize the approximation error, we involve a sufficiently complex target function for the experiment. Specifically, we utilize the following set of functions (where a_i, b_i are parameters) to test the universal approximation capability of b-ResNet. ... The results are shown in Figure 2 and Table 4."
Researcher Affiliation | Academia | "School of Data Science, City University of Hong Kong."
Pseudocode | No | The paper includes high-level construction steps (Table 2) and detailed mathematical proofs, but no formal pseudocode or algorithm blocks.
Open Source Code | No | The paper makes no explicit statement about releasing open-source code and provides no links to code repositories.
Open Datasets | No | "Specifically, we utilize the following set of functions (where a_i, b_i are parameters) to test the universal approximation capability of b-ResNet. ... For each case of d = 100, 200, 300, we randomly selected 30 functions from the set for function approximation experiments."
Dataset Splits | No | "We conduct uniform sampling with 1000d samples and use 90% for training and 10% for testing, and then take the average loss." (A hedged sampling sketch follows the table.)
Hardware Specification | No | The paper does not specify any hardware details, such as GPU/CPU models or other compute resources used for the experiments.
Software Dependencies | No | "We optimize the network parameters using Adam (Kingma & Ba, 2014) with a learning rate of 10^-3." The paper mentions the Adam optimizer but does not specify its version or any other software dependencies with version numbers.
Experiment Setup | Yes | "Specifically, for each case of d = 100, 200, 300, we randomly selected 30 functions from the set for function approximation experiments. ... We then compare b-ResNet with fully-connected (FC) NN for approximating each sampled function, with network structure as RN(d+1, n, d/10) for n ∈ {10, 20, 40}, and NN(d+1, d/10), respectively. ... We conduct uniform sampling with 1000d samples and use 90% for training and 10% for testing, and then take the average loss. We optimize the network parameters using Adam (Kingma & Ba, 2014) with a learning rate of 10^-3 and present the test performance over iteration." (A hedged training sketch follows the table.)
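
The Dataset Splits row pins down the data pipeline only partially. Below is a minimal sketch of one plausible reading, assuming i.i.d. uniform sampling on [0, 1]^d (the quoted excerpt does not state the domain) and a fixed random 90/10 split. The helper `sample_and_split` and the stand-in target function are hypothetical; the paper's actual parameterized family (with parameters a_i, b_i) is elided in the quote.

```python
import numpy as np

def sample_and_split(d, target_fn, n_per_dim=1000, train_frac=0.9, seed=0):
    """Draw 1000*d points and split them 90%/10% into train/test.

    Assumptions (not stated in the quoted excerpt): i.i.d. uniform
    sampling on [0, 1]^d and a single fixed random split.
    """
    rng = np.random.default_rng(seed)
    n = n_per_dim * d
    X = rng.uniform(0.0, 1.0, size=(n, d))
    y = target_fn(X)
    idx = rng.permutation(n)
    n_train = int(train_frac * n)
    tr, te = idx[:n_train], idx[n_train:]
    return X[tr], y[tr], X[te], y[te]

# Hypothetical stand-in target: the paper's function family is elided
# in the quoted excerpt, so any smooth f: [0,1]^d -> R works for the demo.
f = lambda X: np.sin(X).sum(axis=1)
Xtr, ytr, Xte, yte = sample_and_split(d=100, target_fn=f)
```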
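Similarly, here is a hedged PyTorch sketch of the Experiment Setup row. It assumes RN(d+1, n, d/10) denotes a ResNet with trunk width d+1 and d/10 residual blocks, each containing n hidden ReLU neurons, and NN(d+1, d/10) a plain ReLU MLP of width d+1 and depth d/10; the paper's Sec. 4 fixes the exact block definition. The MSE loss and the helper names (`b_resnet`, `fc_nn`, `train`) are assumptions; Adam with learning rate 10^-3 is quoted from the paper.

```python
import torch
import torch.nn as nn

class ResBlock(nn.Module):
    # One residual block: x + W2 ReLU(W1 x), with n hidden neurons.
    def __init__(self, width, n):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(width, n), nn.ReLU(), nn.Linear(n, width))

    def forward(self, x):
        return x + self.body(x)

def b_resnet(d, n, n_blocks):
    # Assumed reading of RN(d+1, n, d/10): trunk width d+1,
    # d/10 residual blocks, each with n hidden ReLU neurons.
    width = d + 1
    layers = [nn.Linear(d, width)]
    layers += [ResBlock(width, n) for _ in range(n_blocks)]
    layers += [nn.Linear(width, 1)]
    return nn.Sequential(*layers)

def fc_nn(d, depth):
    # Assumed reading of NN(d+1, d/10): a plain MLP of width d+1
    # with d/10 hidden ReLU layers.
    width = d + 1
    layers = [nn.Linear(d, width), nn.ReLU()]
    for _ in range(depth - 1):
        layers += [nn.Linear(width, width), nn.ReLU()]
    layers += [nn.Linear(width, 1)]
    return nn.Sequential(*layers)

def train(model, X, y, n_iters=1000):
    # Adam with learning rate 1e-3, as quoted; MSE loss is an assumption.
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.MSELoss()
    for _ in range(n_iters):
        opt.zero_grad()
        loss = loss_fn(model(X).squeeze(-1), y)
        loss.backward()
        opt.step()
    return model

# Small stand-in batch and hypothetical target, just to exercise the loop.
d = 100
X = torch.rand(1000, d)
y = torch.sin(X).sum(dim=1)
model = train(b_resnet(d, n=10, n_blocks=d // 10), X, y, n_iters=10)
```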