Benign Overfitting in Two-layer ReLU Convolutional Neural Networks

Authors: Yiwen Kou, Zixiang Chen, Yuanzhou Chen, Quanquan Gu

ICML 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our result also reveals a sharp transition between benign and harmful overfitting under different conditions on data distribution in terms of test risk. Experiments on synthetic data back up our theory.
Researcher Affiliation | Academia | Department of Computer Science, University of California, Los Angeles. Correspondence to: Quanquan Gu <qgu@cs.ucla.edu>.
Pseudocode | No | I did not find any structured pseudocode or algorithm blocks in the paper.
Open Source Code | Yes | The code for our experiments can be found on GitHub: https://github.com/uclaml/BenignReLUCNN
Open Datasets | No | Here we generate synthetic data exactly following Definition 1.1. Definition 1.1. Let µ ∈ R^d be a fixed vector representing the signal contained in each data point... is generated from a distribution D, which we specify as follows:... The paper defines a synthetic data generation process rather than using an existing public dataset with concrete access information. (See the data-generation sketch after the table.)
Dataset Splits | No | I did not find specific information about validation dataset splits. The paper mentions "training data size n = 20" and "estimate the test error for each case using 1000 test data points."
Hardware Specification | No | I did not find any specific hardware details such as GPU or CPU models, or memory specifications. The paper only states general training parameters for the experiments.
Software Dependencies | No | We use the default initialization method in PyTorch to initialize the CNN parameters and train the CNN with full-batch gradient descent with a learning rate of 0.1 for 100 iterations. (PyTorch is mentioned, but no version number.)
Experiment Setup | Yes | The number of filters is set as m = 10. We use the default initialization method in PyTorch to initialize the CNN parameters and train the CNN with full-batch gradient descent with a learning rate of 0.1 for 100 iterations. (See the training sketch after the table.)
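
Since the paper generates its data synthetically "exactly following Definition 1.1" rather than releasing a dataset, the sketch below shows one way a reproducer might implement such a generator. It assumes the signal-noise patch model used in this line of work: each example has two d-dimensional patches, one carrying the signal scaled by the label and the other Gaussian noise drawn orthogonally to µ, with the observed label flipped independently with some probability. The function name generate_data, the choice of which patch holds the signal, and all parameter values are illustrative assumptions, not details quoted from the paper.

```python
import torch

def generate_data(n, d, mu, sigma_p, flip_prob, generator=None):
    """Signal-noise patch model sketch (assumed, not quoted from the paper)."""
    # Clean Rademacher labels; the observed label flips with probability flip_prob.
    y_clean = (torch.randint(0, 2, (n,), generator=generator) * 2 - 1).float()
    flips = torch.rand(n, generator=generator) < flip_prob
    y = torch.where(flips, -y_clean, y_clean)

    # Noise ~ N(0, sigma_p^2 (I - mu mu^T / ||mu||^2)): sample isotropic Gaussian
    # noise and project out the mu direction.
    xi = sigma_p * torch.randn(n, d, generator=generator)
    mu_unit = mu / mu.norm()
    xi = xi - torch.outer(xi @ mu_unit, mu_unit)

    # Two patches per example: a signal patch and a noise patch. Whether the
    # signal uses the clean or the observed label is a modelling assumption here.
    signal = y_clean.unsqueeze(1) * mu
    x = torch.stack([signal, xi], dim=1)   # shape (n, 2, d)
    return x, y
```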
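
The Experiment Setup row quotes m = 10 filters, default PyTorch initialization, full-batch gradient descent with learning rate 0.1 for 100 iterations, and a test error estimated on 1000 test points. The sketch below puts those quoted settings into a minimal PyTorch training loop. The TwoLayerReLUCNN class (two banks of m ReLU filters with a fixed ±1 second layer, averaged over filters and summed over patches) and the logistic loss are assumptions about the architecture, not code from the paper; the usage at the bottom assumes generate_data from the previous sketch is in the same file.

```python
import torch
import torch.nn.functional as F

class TwoLayerReLUCNN(torch.nn.Module):
    """Two-layer ReLU CNN sketch: m filters per class, fixed +/-1 second layer."""
    def __init__(self, d, m=10):
        super().__init__()
        # Default PyTorch initialization, as stated in the table above.
        self.pos = torch.nn.Linear(d, m, bias=False)
        self.neg = torch.nn.Linear(d, m, bias=False)
        self.m = m

    def forward(self, x):                       # x: (n, 2, d), two patches each
        f_pos = F.relu(self.pos(x)).sum(dim=(1, 2)) / self.m
        f_neg = F.relu(self.neg(x)).sum(dim=(1, 2)) / self.m
        return f_pos - f_neg                    # scalar score per example


def train_and_test(x_train, y_train, x_test, y_test, lr=0.1, iters=100, m=10):
    """Full-batch gradient descent with logistic loss (lr 0.1, 100 iterations)."""
    model = TwoLayerReLUCNN(d=x_train.shape[-1], m=m)
    opt = torch.optim.SGD(model.parameters(), lr=lr)      # plain full-batch GD
    for _ in range(iters):
        opt.zero_grad()
        loss = F.softplus(-y_train * model(x_train)).mean()   # logistic loss
        loss.backward()
        opt.step()
    with torch.no_grad():
        test_err = (torch.sign(model(x_test)) != y_test).float().mean().item()
    return test_err


if __name__ == "__main__":
    # Illustrative run with the sizes quoted above: n = 20 training points and
    # 1000 test points. mu, sigma_p, and flip_prob are placeholder values.
    d = 500
    mu = torch.zeros(d)
    mu[0] = 5.0
    x_tr, y_tr = generate_data(20, d, mu, sigma_p=1.0, flip_prob=0.1)
    x_te, y_te = generate_data(1000, d, mu, sigma_p=1.0, flip_prob=0.1)
    print("test error:", train_and_test(x_tr, y_tr, x_te, y_te))
```

torch.optim.SGD with no momentum is used here only as a convenient way to express plain full-batch gradient descent over the whole training set at every step.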