DARTS-: Robustly Stepping out of Performance Collapse Without Indicators
Authors: Xiangxiang Chu, Xiaoxing Wang, Bo Zhang, Shun Lu, Xiaolin Wei, Junchi Yan
ICLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on various datasets verify that it can substantially improve robustness. |
| Researcher Affiliation | Collaboration | Meituan, Shanghai Jiao Tong University, University of Chinese Academy of Sciences |
| Pseudocode | Yes | Algorithm 1 DARTS- |
| Open Source Code | Yes | Our code is available at https://github.com/Meituan-AutoML/DARTS-. |
| Open Datasets | Yes | Extensive experiments on various datasets verify that it can substantially improve robustness. Our code is available at https://github.com/Meituan-AutoML/DARTS-. ... We conduct thorough experiments across seven search spaces and three datasets to demonstrate the effectiveness of our method. ... CIFAR-10 and CIFAR-100. ... ImageNet. ... NAS-Bench-201 (Dong & Yang, 2020) |
| Dataset Splits | Yes | $\min_{\alpha} \mathcal{L}_{\text{val}}(w^{*}(\alpha), \alpha)$ s.t. $w^{*}(\alpha) = \arg\min_{w} \mathcal{L}_{\text{train}}(w, \alpha)$ |
| Hardware Specification | Yes | It takes about 4.5 GPU days on Tesla V100. |
| Software Dependencies | No | The paper mentions using SGD and Adam optimizers and general settings for training (e.g., learning rate, batch size) but does not specify software versions for libraries like PyTorch, TensorFlow, or CUDA. |
| Experiment Setup | Yes | We use the SGD optimizer for weights and Adam (β1 = 0.5, β2 = 0.999, learning rate 0.001) for architecture parameters, with a batch size of 768. The initial learning rate is 0.045 and is decayed to 0 within 30 epochs following the cosine schedule. We also use L2 regularization of 1e-4. (See the sketch below the table.) |
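
The dataset-splits and experiment-setup rows together describe the usual DARTS-style alternating bilevel search: architecture parameters α are updated on validation batches and network weights w on training batches. Below is a minimal sketch of that setup, assuming PyTorch. The `ToySupernet` model, the random stand-in batches, and the SGD momentum value are placeholders not taken from the paper; the optimizer choices, learning rates, Adam betas, batch size, cosine schedule, and L2 regularization mirror the quoted setup, and none of this is the authors' released implementation.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)


class ToySupernet(nn.Module):
    """Hypothetical stand-in for a supernet: ordinary weights plus architecture logits alpha."""

    def __init__(self, dim=16, num_ops=4, num_classes=10):
        super().__init__()
        self.ops = nn.ModuleList(nn.Linear(dim, dim) for _ in range(num_ops))
        self.alpha = nn.Parameter(1e-3 * torch.randn(num_ops))  # architecture parameters
        self.head = nn.Linear(dim, num_classes)

    def forward(self, x):
        mix = torch.softmax(self.alpha, dim=-1)
        return self.head(sum(w * op(x) for w, op in zip(mix, self.ops)))

    def weight_parameters(self):
        # Everything except alpha counts as a network weight w.
        return [p for n, p in self.named_parameters() if n != "alpha"]


model = ToySupernet()
criterion = nn.CrossEntropyLoss()

EPOCHS, BATCH = 30, 768  # values quoted in the experiment-setup row

# SGD for network weights w (momentum is an assumption, not stated in the quote).
w_opt = torch.optim.SGD(model.weight_parameters(), lr=0.045,
                        momentum=0.9, weight_decay=1e-4)
# Adam for architecture parameters alpha, as quoted.
a_opt = torch.optim.Adam([model.alpha], lr=0.001, betas=(0.5, 0.999))
# Cosine schedule decaying the weight learning rate to 0 within 30 epochs.
sched = torch.optim.lr_scheduler.CosineAnnealingLR(w_opt, T_max=EPOCHS, eta_min=0.0)

for epoch in range(EPOCHS):
    # Random tensors stand in for real train / validation batches.
    x_tr, y_tr = torch.randn(BATCH, 16), torch.randint(0, 10, (BATCH,))
    x_va, y_va = torch.randn(BATCH, 16), torch.randint(0, 10, (BATCH,))

    # Architecture step: approximate min_alpha L_val(w*(alpha), alpha).
    a_opt.zero_grad()
    criterion(model(x_va), y_va).backward()
    a_opt.step()

    # Weight step: min_w L_train(w, alpha).
    w_opt.zero_grad()
    criterion(model(x_tr), y_tr).backward()
    w_opt.step()

    sched.step()
```

The first-order alternation above is only meant to illustrate how the two data splits and the two optimizers from the table fit together; details specific to DARTS- (e.g., its auxiliary skip connection) are not reconstructed here.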