Generalization Guarantees for Neural Architecture Search with Train-Validation Split

Authors: Samet Oymak, Mingchen Li, Mahdi Soltanolkotabi

ICML 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We provide two sets of experiments to verify our theory.
Researcher Affiliation | Academia | (1) Department of Electrical and Computer Eng., University of California, Riverside; (2) Department of Computer Science and Eng., University of California, Riverside; (3) Ming Hsieh Department of Electrical Eng., University of Southern California.
Pseudocode | No | The paper describes an algorithm with numbered steps, but it is not formatted as a pseudocode block or labeled as 'Algorithm'.
Open Source Code | No | No explicit statement or link to an open-source code repository for the described methodology is provided.
Open Datasets | Yes | a. Lipschitzness of Trained Networks. ... binary MNIST task... b. Test-Validation Gap for DARTS. ... DARTS algorithm (Liu et al., 2018) over CIFAR-10 dataset...
Dataset Splits | Yes | We evaluate train/test/validation losses of the continuously-parameterized supernet with h = 224 hyperparameters. We observe that training error is consistently zero after 30 epochs, whereas validation error almost perfectly tracks test error as soon as the validation size is mildly large (e.g. 250), which is consistent with Figures (a), (b) and our theory. [...] In Figures 3 and 1(c), we assess the gap between the test and validation errors while varying validation sizes from 20 to 1000. (See the split sketch after the table.)
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU models, CPU types) used for running the experiments.
Software Dependencies | No | The paper mentions using 'SGD optimizer' but does not specify any software platforms (e.g., PyTorch, TensorFlow) or their version numbers.
Experiment Setup | Yes | We only consider the search phase of DARTS and train for 50 epochs using SGD. [...] Finally, we initialize the network with He initialization and train the model for 60 epochs with batch size 128 with SGD optimizer and learning rate 0.003. (See the training-setup sketch after the table.)
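
The Dataset Splits row reports validation sizes swept from 20 to 1000 on CIFAR-10, but the paper releases no code. The following is a minimal sketch, assuming PyTorch/torchvision; the function name make_splits, the seed, and the shuffling scheme are our own illustration, not the authors' procedure.

```python
# Sketch (not the authors' code): carve a small validation set out of the
# CIFAR-10 training data, keeping the official test set for test error.
# The use of torchvision and all names here are assumptions.
import torch
from torch.utils.data import Subset, DataLoader
from torchvision import datasets, transforms

def make_splits(root="./data", val_size=250, seed=0):
    transform = transforms.ToTensor()
    full_train = datasets.CIFAR10(root, train=True, download=True, transform=transform)
    test_set = datasets.CIFAR10(root, train=False, download=True, transform=transform)

    # Shuffle indices once, then reserve the first `val_size` samples for validation.
    g = torch.Generator().manual_seed(seed)
    perm = torch.randperm(len(full_train), generator=g).tolist()
    val_set = Subset(full_train, perm[:val_size])
    train_set = Subset(full_train, perm[val_size:])
    return train_set, val_set, test_set

# Example: sweep validation sizes in the range studied in Figures 3 and 1(c).
for val_size in [20, 50, 100, 250, 500, 1000]:
    train_set, val_set, test_set = make_splits(val_size=val_size)
    val_loader = DataLoader(val_set, batch_size=128, shuffle=False)
```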
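Similarly, the Experiment Setup row gives only the hyperparameters (He initialization, SGD, learning rate 0.003, batch size 128, 60 epochs). A hedged sketch of such a training loop, again assuming PyTorch, is shown below; `model` is a placeholder for the authors' network, which is not specified here.

```python
# Sketch of the reported configuration: He (Kaiming) init, SGD with lr=0.003,
# batch size 128, 60 epochs. PyTorch is an assumption; the paper does not
# name its framework.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader

def he_init(module):
    # Kaiming/He initialization for conv and linear layers.
    if isinstance(module, (nn.Conv2d, nn.Linear)):
        nn.init.kaiming_normal_(module.weight, nonlinearity="relu")
        if module.bias is not None:
            nn.init.zeros_(module.bias)

def train(model, train_set, epochs=60, batch_size=128, lr=0.003):
    model.apply(he_init)
    loader = DataLoader(train_set, batch_size=batch_size, shuffle=True)
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    criterion = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in loader:
            optimizer.zero_grad()
            loss = criterion(model(x), y)
            loss.backward()
            optimizer.step()
    return model
```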