On Redundancy and Diversity in Cell-based Neural Architecture Search
Authors: Xingchen Wan, Binxin Ru, Pedro M Esperança, Zhenguo Li
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this work, we conduct an empirical post-hoc analysis of architectures from the popular cell-based search spaces and find that the existing search spaces contain a high degree of redundancy |
| Researcher Affiliation | Collaboration | 1) Machine Learning Research Group, University of Oxford; 2) Huawei Noah's Ark Lab, London; 3) Huawei Noah's Ark Lab, Hong Kong |
| Pseudocode | No | The paper does not contain any sections or figures explicitly labeled "Pseudocode" or "Algorithm". |
| Open Source Code | Yes | Code is available at https://github.com/xingchenwan/cell-based-NAS-analysis. |
| Open Datasets | Yes | NB301 (Siems et al., 2020) which includes 50,000+ architecture performance pairs in the DARTS space |
| Dataset Splits | Yes | on the CIFAR-10 dataset using the standard train/val split |
| Hardware Specification | Yes | on a single NVIDIA Tesla V100 GPU |
| Software Dependencies | No | The paper lists training parameters and techniques like "Optimizer: SGD", "Cutout: True", and "Mixup: True", but does not specify versions for any software libraries or dependencies (e.g., PyTorch, TensorFlow, Python version). |
| Experiment Setup | Yes | Specifically, we train architectures obtained from stacking the cells 8 times (8-layer architectures) with an initial channel count of 32 on the CIFAR-10 dataset using the standard train/val split, and we use the hyperparameters below on a single NVIDIA Tesla V100 GPU: Optimizer: SGD; Initial learning rate: 0.025; Final learning rate: 1e-8; Learning rate schedule: cosine annealing; Epochs: 100; Weight decay: 3e-4; Momentum: 0.9; Auxiliary tower: True; Auxiliary weight: 0.4; Cutout: True; Cutout length: 16; Drop path probability: 0.2; Gradient clip: 5; Batch size: 96; Mixup: True; Mixup alpha: 0.2 |
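
The quoted experiment setup can be read as a concrete training configuration. Below is a minimal sketch (assuming PyTorch and torchvision, whose versions the paper does not specify) of how those hyperparameters might be wired together; the tiny CNN is only a runnable stand-in for the paper's 8-layer, 32-initial-channel cell-stacked architectures, and cutout, mixup, drop path, and the auxiliary tower are indicated in comments rather than implemented.

```python
# Hedged sketch of the quoted training setup (not the authors' released code).
import torch.nn as nn
from torch.optim import SGD
from torch.optim.lr_scheduler import CosineAnnealingLR
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

EPOCHS, BATCH_SIZE = 100, 96

# Standard CIFAR-10 training data; in the full pipeline, Cutout (length 16)
# and Mixup (alpha = 0.2) augmentations would also be applied here.
transform = transforms.Compose([transforms.ToTensor()])
train_set = datasets.CIFAR10("./data", train=True, download=True, transform=transform)
train_loader = DataLoader(train_set, batch_size=BATCH_SIZE, shuffle=True)

# Placeholder network standing in for the 8-layer cell-stacked architecture
# (drop path probability 0.2 and the auxiliary tower are omitted here).
model = nn.Sequential(
    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 10),
)

optimizer = SGD(model.parameters(), lr=0.025, momentum=0.9, weight_decay=3e-4)
scheduler = CosineAnnealingLR(optimizer, T_max=EPOCHS, eta_min=1e-8)  # anneal to final LR 1e-8
criterion = nn.CrossEntropyLoss()

for epoch in range(EPOCHS):
    for x, y in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(x), y)  # + 0.4 * aux loss when the auxiliary tower is enabled
        loss.backward()
        nn.utils.clip_grad_norm_(model.parameters(), 5.0)  # gradient clip: 5
        optimizer.step()
    scheduler.step()
```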