Learning from Mistakes – a Framework for Neural Architecture Search
Authors: Bhanu Garg, Li Zhang, Pradyumna Sridhara, Ramtin Hosseini, Eric Xing, Pengtao Xie
AAAI 2022, pp. 10184-10192 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The experiments are performed on three popular NAS datasets, namely CIFAR-10, CIFAR-100 (Krizhevsky, Hinton et al. 2009) and ImageNet (Deng et al. 2009). Experimental results strongly demonstrate the effectiveness of our model. The results of the classification error (%) of different NAS methods on CIFAR-100 are shown in Table 1. |
| Researcher Affiliation | Academia | 1 University of California, San Diego, USA 2 Zhejiang University, China 3 Carnegie Mellon University, USA 4 Mohamed bin Zayed University of Artificial Intelligence, UAE |
| Pseudocode | Yes | The overall algorithm of LFM, when applied to other related methods, can be summarised in Algorithm 1, and LFM can be applied to any differentiable NAS method. Algorithm 1 (Algorithm for LFM): 1: while not converged do; 2: Update W1; 3: Update W2; 4: Update A, V, r respectively; 5: end while. (A hedged code sketch of this loop appears after the table.) |
| Open Source Code | No | The paper does not provide an explicit statement about releasing code for its methodology or a direct link to a source-code repository. |
| Open Datasets | Yes | The experiments are performed on three popular NAS datasets, namely CIFAR-10, CIFAR-100 (Krizhevsky, Hinton et al. 2009) and ImageNet (Deng et al. 2009). |
| Dataset Splits | Yes | We split each of these datasets into a training set with 25K images, a validation set with 25K images, and a test set with 10K images. During architecture search in LFM, the training set is used as Dtr and the validation set is used as Dval. (A minimal split sketch appears after the table.) |
| Hardware Specification | Yes | The LFM experiments in this paper use A100 for DARTS and A40 for PDARTS. ...trained using two Tesla A100 GPUs... |
| Software Dependencies | No | The paper mentions using specific models and algorithms but does not provide specific version numbers for any software dependencies like programming languages, libraries (e.g., PyTorch, TensorFlow), or solvers. |
| Experiment Setup | Yes | For the architecture of the encoder model, we experimented with ResNet-18 and ResNet-34 (He et al. 2016b). The search algorithm was based on SGD, and the hyperparameters of epochs, initial learning rate, and momentum follow the original implementations of DARTS (Liu, Simonyan, and Yang 2019) and PDARTS (Chen et al. 2021). We use a batch size of 64 for both DARTS and PDARTS. ...We trained the network with a batch size of 96, an epoch number of 600, on a single Tesla V100 GPU. ...with a batch size of 1024 and an epoch number of 250. The initial channel number is set to 48. (A hedged configuration sketch appears after the table.) |
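
The pseudocode row describes Algorithm 1 as an alternating update of the network weights W1 and W2 and of the architecture and mistake-weighting variables A, V, r. Below is a minimal PyTorch-style sketch of that outer loop, assuming hypothetical loss callables (`loss_w1`, `loss_w2`, `loss_arch`) and optimizer settings; the actual LFM objectives (how W2 re-learns W1's mistakes and how V and r re-weight examples) follow the paper, not this sketch.

```python
import torch

# Minimal sketch of Algorithm 1 (LFM outer loop). Parameter groups and loss
# callables are placeholders; only the update order mirrors the pseudocode.
def lfm_search(w1_params, w2_params, arch_params, mistake_weights,
               train_loader, val_loader, loss_w1, loss_w2, loss_arch,
               lr=0.025, arch_lr=3e-4, max_steps=10_000):
    opt_w1 = torch.optim.SGD(w1_params, lr=lr, momentum=0.9)
    opt_w2 = torch.optim.SGD(w2_params, lr=lr, momentum=0.9)
    opt_arch = torch.optim.Adam(list(arch_params) + list(mistake_weights), lr=arch_lr)

    for step, ((x_tr, y_tr), (x_val, y_val)) in enumerate(zip(train_loader, val_loader)):
        # 1) Update W1 on the training set
        opt_w1.zero_grad()
        loss_w1(x_tr, y_tr).backward()
        opt_w1.step()

        # 2) Update W2, re-learning the examples W1 got wrong
        opt_w2.zero_grad()
        loss_w2(x_tr, y_tr).backward()
        opt_w2.step()

        # 3) Update architecture A and mistake-weighting variables V, r
        #    on the validation set
        opt_arch.zero_grad()
        loss_arch(x_val, y_val).backward()
        opt_arch.step()

        if step + 1 >= max_steps:  # stand-in for "while not converged"
            break
```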
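The dataset-splits row corresponds to dividing CIFAR's 50K official training images in half. A minimal sketch with torchvision, assuming `random_split` with a fixed seed (the paper does not specify how the halves are chosen) and a plain tensor transform:

```python
import torch
from torch.utils.data import random_split, DataLoader
from torchvision import datasets, transforms

# Sketch of the reported split: 25K images for Dtr, 25K for Dval, and the
# official 10K test set held out. Transform and seed are assumptions.
transform = transforms.ToTensor()
full_train = datasets.CIFAR10(root="./data", train=True, download=True, transform=transform)
test_set = datasets.CIFAR10(root="./data", train=False, download=True, transform=transform)

generator = torch.Generator().manual_seed(0)
d_tr, d_val = random_split(full_train, [25_000, 25_000], generator=generator)

train_loader = DataLoader(d_tr, batch_size=64, shuffle=True)   # used as Dtr during search
val_loader = DataLoader(d_val, batch_size=64, shuffle=True)    # used as Dval during search
test_loader = DataLoader(test_set, batch_size=64, shuffle=False)
```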
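The experiment-setup row lists the search and evaluation hyperparameters quoted from the paper. The sketch below collects them into one place, assuming DARTS-style defaults (learning rate 0.025, momentum 0.9, weight decay 3e-4, cosine schedule) for the values the excerpt does not state:

```python
import torch

# Reported settings: SGD-based search with batch size 64 (DARTS/PDARTS),
# CIFAR evaluation for 600 epochs with batch size 96, ImageNet evaluation
# for 250 epochs with batch size 1024 and 48 initial channels.
# The optimizer constants below are assumed DARTS defaults, not quotes.
def make_eval_optimizer(model, epochs=600, lr=0.025, momentum=0.9, weight_decay=3e-4):
    optimizer = torch.optim.SGD(model.parameters(), lr=lr,
                                momentum=momentum, weight_decay=weight_decay)
    # Cosine annealing over the full evaluation run
    scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=epochs)
    return optimizer, scheduler

SEARCH_BATCH_SIZE = 64       # architecture search, DARTS and PDARTS
EVAL_BATCH_SIZE = 96         # CIFAR evaluation, 600 epochs, single GPU
IMAGENET_BATCH_SIZE = 1024   # ImageNet evaluation, 250 epochs, 48 initial channels
```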