Towards Understanding Generalization via Decomposing Excess Risk Dynamics

Authors: Jiaye Teng, Jianhao Ma, Yang Yuan

ICLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on neural networks verify the utility of the decomposition framework. (...) We conduct several experiments on both synthetic datasets and real-world datasets (MNIST, CIFAR-10) to validate the utility of the decomposition framework.
Researcher Affiliation | Academia | Institute for Interdisciplinary Information Sciences, Tsinghua University; Department of Industrial and Operations Engineering, University of Michigan, Ann Arbor; Shanghai Qi Zhi Institute
Pseudocode | No | No explicit pseudocode or algorithm blocks were found in the paper.
Open Source Code | No | No statements regarding the availability of open-source code for the described methodology were found in the paper.
Open Datasets | Yes | Experiments on neural networks on both synthetic and real-world datasets (MNIST, CIFAR-10).
Dataset Splits | No | The paper mentions using the MNIST and CIFAR-10 datasets but does not explicitly provide training/validation/test splits (e.g., percentages or counts) or reference standard splits.
Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments are mentioned in the paper.
Software Dependencies | No | The paper mentions optimizers such as SGD, Adam, and Rprop and network types (ReLU, CNN), but does not provide specific software packages with version numbers (e.g., PyTorch 1.9, TensorFlow 2.x).
Experiment Setup | Yes | In this experiment, we fix the width of each layer to be 64, and use SGD as the optimizer with Gaussian initialization (N(0, σ²), where σ = 1×10⁻³) and stepsize η = 1×10⁻². (...) Adam (stepsize η = 0.002, (β₁, β₂) = (0.9, 0.999), ϵ = 1e-08, no weight decay), Rprop (learning rate η = 5×10⁻⁴, (η₁, η₂) = (0.5, 1.2), stepsizes (1×10⁻⁶, 50)).
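
The quoted hyperparameters can be expressed as a minimal sketch. The paper does not name its software framework, so PyTorch is assumed here, and the three-layer width-64 MLP with a 784-dimensional input is a hypothetical stand-in for the actual architecture; only the optimizer settings and the Gaussian initialization follow the excerpt above.

```python
# Hedged reconstruction of the reported optimizer settings.
# Assumptions (not stated in the paper): PyTorch as framework,
# MLP architecture, and MNIST-sized (784-dim) input.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(784, 64), nn.ReLU(),   # width of each layer fixed to 64
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 10),
)

# Gaussian initialization N(0, sigma^2) with sigma = 1e-3
for p in model.parameters():
    nn.init.normal_(p, mean=0.0, std=1e-3)

# SGD with stepsize eta = 1e-2
sgd = torch.optim.SGD(model.parameters(), lr=1e-2)

# Adam: eta = 0.002, (beta1, beta2) = (0.9, 0.999), eps = 1e-08, no weight decay
adam = torch.optim.Adam(model.parameters(), lr=0.002,
                        betas=(0.9, 0.999), eps=1e-08, weight_decay=0)

# Rprop: eta = 5e-4, (eta1, eta2) = (0.5, 1.2), step sizes (1e-6, 50)
rprop = torch.optim.Rprop(model.parameters(), lr=5e-4,
                          etas=(0.5, 1.2), step_sizes=(1e-6, 50))
```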