How Much Over-parameterization Is Sufficient to Learn Deep ReLU Networks?
Authors: Zixiang Chen, Yuan Cao, Difan Zou, Quanquan Gu
ICLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we conduct some simple experiments to validate our theory. Since our paper mainly focuses on binary classification, we use a subset of the original CIFAR10 dataset (Krizhevsky et al., 2009), which only has two classes of images. |
| Researcher Affiliation | Academia | Department of Computer Science, University of California, Los Angeles {chenzx19,yuancao,knowzou,qgu}@cs.ucla.edu |
| Pseudocode | Yes | Algorithm 1 Gradient descent with random initialization |
| Open Source Code | No | The paper does not contain any explicit statement about providing open-source code for the described methodology, nor does it provide a link to a code repository. |
| Open Datasets | Yes | we use a subset of the original CIFAR10 dataset (Krizhevsky et al., 2009) (a hedged subset-construction sketch follows the table) |
| Dataset Splits | No | The paper mentions using a subset of CIFAR10 for training and evaluating training error but does not specify any explicit training/validation/test dataset splits (e.g., percentages, counts, or predefined splits). |
| Hardware Specification | No | The paper does not provide any specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers (e.g., library names with versions) needed to replicate the experiment. |
| Experiment Setup | No | The paper mentions training a '5-layer fully-connected ReLU network' and varying 'sample sizes' but does not provide specific hyperparameters such as learning rate, batch size, optimizer, or number of epochs for the experimental setup (a hedged training sketch follows the table). |
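Since the paper does not release code, the snippet below is a minimal sketch, assuming `torchvision` and NumPy, of how the two-class CIFAR10 subset cited in the 'Open Datasets' row could be constructed. The class pair (0 = airplane, 1 = automobile), the per-class sample count, and the ±1 label encoding are illustrative assumptions; the paper does not specify them.

```python
import numpy as np
import torchvision

def binary_cifar10_subset(root="./data", classes=(0, 1), n_per_class=500):
    # Download the standard CIFAR10 training set; keep only two classes.
    # The class pair and n_per_class are assumptions, not values from the paper.
    ds = torchvision.datasets.CIFAR10(root=root, train=True, download=True)
    data = ds.data.astype(np.float32) / 255.0  # (50000, 32, 32, 3), scaled to [0, 1]
    labels = np.array(ds.targets)
    xs, ys = [], []
    for sign, c in zip((1.0, -1.0), classes):
        idx = np.where(labels == c)[0][:n_per_class]  # first n_per_class examples of class c
        xs.append(data[idx].reshape(len(idx), -1))    # flatten each image to a vector
        ys.append(np.full(len(idx), sign, dtype=np.float32))  # encode labels as +/-1
    return np.concatenate(xs), np.concatenate(ys)
```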
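Likewise, the following is a minimal sketch, assuming PyTorch, of the training recipe the 'Pseudocode' and 'Experiment Setup' rows describe: a 5-layer fully-connected ReLU network trained by full-batch gradient descent from Gaussian random initialization (in the spirit of the paper's Algorithm 1). The width, initialization scale, logistic (soft-margin) loss, learning rate, and step count are assumptions, since the paper reports none of these hyperparameters.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def make_network(d_in, width=1000, depth=5):
    # depth-1 hidden ReLU layers followed by a scalar output layer.
    layers, d = [], d_in
    for _ in range(depth - 1):
        layers += [nn.Linear(d, width), nn.ReLU()]
        d = width
    layers.append(nn.Linear(d, 1))
    net = nn.Sequential(*layers)
    for p in net.parameters():
        nn.init.normal_(p, std=width ** -0.5)  # Gaussian random initialization (scale assumed)
    return net

def train_gd(net, x, y, lr=0.1, steps=1000):
    # Full-batch gradient descent on the logistic (soft-margin) loss with +/-1 labels.
    opt = torch.optim.SGD(net.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = F.soft_margin_loss(net(x).squeeze(-1), y)
        loss.backward()
        opt.step()
    return net
```

Combined with the subset sketch above, a run would look like `net = train_gd(make_network(x.shape[1]), torch.from_numpy(x), torch.from_numpy(y))`.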