Saliency-Aware Neural Architecture Search

Authors: Ramtin Hosseini, Pengtao Xie

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on several datasets demonstrate the effectiveness of our framework. In this section, we present experimental results.
Researcher Affiliation | Academia | Ramtin Hosseini and Pengtao Xie, UC San Diego, rhossein@eng.ucsd.edu, p1xie@eng.ucsd.edu
Pseudocode | No | The paper describes its method in prose, including a four-stage optimization framework and an optimization algorithm, but does not present any pseudocode or clearly labeled algorithm blocks.
Open Source Code | No | Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [No] The data is publicly available. The code is proprietary.
Open Datasets | Yes | Datasets: We used three datasets: CIFAR-10 [35], CIFAR-100 [36], and ImageNet [15].
Dataset Splits | Yes | For each of them, we split it into a train, validation, and test set with 25K, 25K, and 10K images respectively. Following [66], we randomly sample 10% of the 1.2M images to form a new training set and another 2.5% to form a validation set, then perform a search on them.
Hardware Specification | Yes | search cost (GPU days on a Nvidia 1080Ti). Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? [Yes] They are included in the supplements.
Software Dependencies | No | The paper does not explicitly specify software dependencies with version numbers (e.g., specific libraries or frameworks like PyTorch or TensorFlow with their versions).
Experiment Setup | Yes | The tradeoff parameter γ is set to 2. The norm bound ε of perturbations is set to 0.03. ... with a batch size of 64, an initial learning rate of 0.025 with cosine scheduling, an epoch number of 50, a weight decay of 3e-4, and a momentum of 0.9. We optimize weight parameters using SGD. The initial learning rate is set to 2e-2. It is annealed using a cosine scheduler. The momentum is set to 0.9. We use Adam [34] to optimize the architecture variables. The learning rate is set to 3e-4 and weight decay is set to 1e-3.
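The Dataset Splits row quotes a 25K/25K/10K train/validation/test split for the CIFAR datasets and a 10% / 2.5% subsample of the roughly 1.2M ImageNet training images for architecture search. Below is a minimal sketch of such splits, assuming PyTorch and torchvision; the library choice, data path, and random seed are not specified in the paper and are assumptions here.

```python
# Minimal sketch (not the authors' code) of the quoted data splits, assuming
# PyTorch / torchvision; the data path and random seed are assumptions.
import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

transform = transforms.ToTensor()
generator = torch.Generator().manual_seed(0)  # seed not specified in the paper

# CIFAR-10 (and CIFAR-100) ship with 50K training and 10K test images; the
# quote splits the 50K training images into 25K train / 25K validation.
full_train = datasets.CIFAR10("./data", train=True, download=True, transform=transform)
test_set = datasets.CIFAR10("./data", train=False, download=True, transform=transform)
train_set, val_set = random_split(full_train, [25_000, 25_000], generator=generator)
print(len(train_set), len(val_set), len(test_set))  # 25000 25000 10000

# ImageNet: the quote samples 10% of the ~1.2M training images to form a new
# training set and another 2.5% for validation; an index-level sketch:
num_images = 1_281_167  # standard ImageNet-1k training-set size
perm = torch.randperm(num_images, generator=generator)
search_train_idx = perm[: int(0.10 * num_images)]
search_val_idx = perm[int(0.10 * num_images): int(0.125 * num_images)]
```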
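The Experiment Setup row lists the search-phase hyperparameters: SGD with momentum 0.9, weight decay 3e-4, and a cosine-annealed learning rate for the network weights, and Adam with learning rate 3e-4 and weight decay 1e-3 for the architecture variables. The sketch below wires those quoted values into PyTorch optimizers; it is not the authors' implementation, and the placeholder parameter lists, the T_max choice, and the loop structure are assumptions.

```python
# Minimal sketch (not the authors' implementation) of the optimizer settings
# quoted in the Experiment Setup row; parameter lists are placeholders.
import torch

gamma = 2.0      # quoted tradeoff parameter (enters the paper's loss; unused in this sketch)
epsilon = 0.03   # quoted norm bound on perturbations (unused in this sketch)

# Placeholder parameter groups standing in for the searched network's weights
# and its architecture variables.
weight_params = [torch.nn.Parameter(torch.randn(8, 8))]
arch_params = [torch.nn.Parameter(torch.zeros(14, 8))]

epochs = 50      # quoted epoch number (the quote also gives a batch size of 64)

# Network weights: SGD with momentum 0.9, weight decay 3e-4, cosine-annealed
# learning rate. The quote mentions both 0.025 and 2e-2 as initial learning
# rates, presumably for different stages of the framework; 0.025 is used here.
w_optimizer = torch.optim.SGD(weight_params, lr=0.025, momentum=0.9, weight_decay=3e-4)
w_scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(w_optimizer, T_max=epochs)

# Architecture variables: Adam with learning rate 3e-4 and weight decay 1e-3.
a_optimizer = torch.optim.Adam(arch_params, lr=3e-4, weight_decay=1e-3)

for epoch in range(epochs):
    # ... alternate weight updates (w_optimizer) and architecture updates
    # (a_optimizer) over training/validation batches here ...
    w_optimizer.step()   # placeholder step so the scheduler advances cleanly
    w_scheduler.step()
```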