Generalization Properties of NAS under Activation and Skip Connection Search
Authors: Zhenyu Zhu, Fanghui Liu, Grigorios Chrysos, Volkan Cevher
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To validate our theoretical results, we conduct a series of experiments on NAS. First, we simulate the NTK matrices under different depths in Appendix F.4 to verify the relationship between the minimum eigenvalue of the NTK and the network depth L in Theorem 1. In Sec. 5.1 we use the DARTS algorithm [Liu et al., 2019b] to conduct experiments on activation function search and skip connection search under the search space of Equation (1). Finally, we use the minimum eigenvalue of the NTK to guide the training of NAS on the benchmark NAS-Bench-201 [Dong and Yang, 2020], with a comparison against recent NAS algorithms. (A minimal NTK eigenvalue sketch follows the table.) |
| Researcher Affiliation | Academia | Zhenyu Zhu, Fanghui Liu, Grigorios G Chrysos, Volkan Cevher EPFL, Switzerland {[first name].[surname]}@epfl.ch |
| Pseudocode | Yes | Algorithm 1: SGD for training DNNs by NAS (a hedged SGD sketch follows the table) |
| Open Source Code | No | The code will be open-sourced upon the acceptance of the paper. |
| Open Datasets | Yes | We select Fashion-MNIST [Xiao et al., 2017] as a standard benchmark. NAS-Bench-201 [Dong and Yang, 2020] is a commonly used benchmark for NAS algorithm evaluation, which includes three datasets: a) CIFAR-10 [Krizhevsky et al., 2014], b) CIFAR-100 [Krizhevsky et al., 2014] and c) ImageNet-16 [Chrabaszcz et al., 2017] for image classification. |
| Dataset Splits | Yes | Then, we conduct neural network training on the selected architecture by SGD. For ease of theoretical analysis, we employ constant step-size SGD with one epoch and randomly choose the weight parameters during all the iterations, which is commonly used in deep learning theory [Cao and Gu, 2019, Zou et al., 2019]. Subsequently, the top-k best candidate architectures are chosen in KNAS and our Eigen-NAS, and then the best architecture is chosen by the validation error. (See the selection sketch after the table.) |
| Hardware Specification | No | All our experiments are conducted on a single GPU in our internal cluster. |
| Software Dependencies | No | No specific software dependencies with version numbers (e.g., Python 3.8, PyTorch 1.9) are mentioned in the paper. |
| Experiment Setup | Yes | We conduct the experiment via DARTS on a feedforward neural network with L = 10 and m = 1024, with 5 runs. Gaussian initialization: W_l^(1) ∼ N(0, 1/m), l ∈ [L]. Input: search space S, data D_tr = {(x_i, y_i)}_{i=1}^N, step size γ, and Flag_method ∈ {Eigen-NAS, DARTS}. (The sketches below instantiate this setup.) |
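
The quotes above center on the minimum eigenvalue of the empirical NTK, so the following minimal sketch (not the authors' code, which was withheld pending acceptance) shows one way to reproduce the depth study: build a feedforward ReLU network with the Gaussian initialization W ∼ N(0, 1/m) from the setup row, form the NTK Gram matrix from per-example gradients, and read off its smallest eigenvalue. PyTorch is an assumption here, since the paper names no software stack; `make_mlp`, `ntk_min_eigenvalue`, and the small width 128 (for speed, versus m = 1024 in the paper) are illustrative choices.

```python
import torch

def make_mlp(depth: int, width: int, d_in: int) -> torch.nn.Sequential:
    """Feedforward ReLU net with Gaussian init W ~ N(0, 1/m) (variance 1/width)."""
    layers, d = [], d_in
    for _ in range(depth):
        lin = torch.nn.Linear(d, width, bias=False)
        torch.nn.init.normal_(lin.weight, std=(1.0 / width) ** 0.5)
        layers += [lin, torch.nn.ReLU()]
        d = width
    head = torch.nn.Linear(d, 1, bias=False)
    torch.nn.init.normal_(head.weight, std=(1.0 / width) ** 0.5)
    layers.append(head)
    return torch.nn.Sequential(*layers)

def ntk_min_eigenvalue(model: torch.nn.Module, x: torch.Tensor) -> float:
    """lambda_min of K, where K_ij = <grad_theta f(x_i), grad_theta f(x_j)>."""
    params = list(model.parameters())
    rows = []
    for i in range(x.shape[0]):
        g = torch.autograd.grad(model(x[i:i + 1]).squeeze(), params)
        rows.append(torch.cat([gi.reshape(-1) for gi in g]))
    J = torch.stack(rows)                       # (N, #params) Jacobian
    K = J @ J.T                                 # empirical NTK Gram matrix
    return torch.linalg.eigvalsh(K)[0].item()   # eigenvalues sorted ascending

if __name__ == "__main__":
    x = torch.randn(32, 16)                     # toy probe inputs (N=32, d=16)
    for L in (2, 4, 8, 10):                     # vary depth as in Appendix F.4
        print(L, ntk_min_eigenvalue(make_mlp(L, 128, 16), x))
```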
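Algorithm 1 in the paper couples architecture search with plain SGD training; per the dataset-splits row, training uses constant step-size SGD for a single epoch. Below is a hedged sketch of that training step under the usual PyTorch loader and loss conventions; the helper name `train_one_epoch` and the step-size argument `gamma` are ours, not the paper's.

```python
def train_one_epoch(model: torch.nn.Module, loader, gamma: float) -> torch.nn.Module:
    """Constant step-size SGD, one pass over D_tr (one epoch), as in the quotes."""
    opt = torch.optim.SGD(model.parameters(), lr=gamma)  # fixed learning rate
    loss_fn = torch.nn.CrossEntropyLoss()
    model.train()
    for xb, yb in loader:
        opt.zero_grad()
        loss_fn(model(xb), yb).backward()
        opt.step()
    return model
```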
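Finally, the dataset-splits row describes the selection protocol shared by KNAS and Eigen-NAS: rank candidates by the NTK minimum eigenvalue, keep the top-k, train each briefly, and pick the winner by validation error. The sketch below chains the two helpers above; `candidates` (a list of zero-argument model constructors), `eigen_nas_select`, and `validation_error` are hypothetical stand-ins for the paper's procedure, since no code was released.

```python
@torch.no_grad()
def validation_error(model: torch.nn.Module, loader) -> float:
    """Fraction of misclassified validation examples."""
    model.eval()
    wrong = total = 0
    for xb, yb in loader:
        wrong += (model(xb).argmax(dim=1) != yb).sum().item()
        total += yb.numel()
    return wrong / total

def eigen_nas_select(candidates, x_probe, train_loader, val_loader, k, gamma):
    """Score by lambda_min(NTK), keep top-k, then choose by validation error."""
    ranked = sorted(candidates, reverse=True,
                    key=lambda build: ntk_min_eigenvalue(build(), x_probe))
    best, best_err = None, float("inf")
    for build in ranked[:k]:                     # top-k candidates only
        model = train_one_epoch(build(), train_loader, gamma)
        err = validation_error(model, val_loader)
        if err < best_err:
            best, best_err = model, err
    return best, best_err
```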