Learning from Mistakes – a Framework for Neural Architecture Search

Authors: Bhanu Garg, Li Zhang, Pradyumna Sridhara, Ramtin Hosseini, Eric Xing, Pengtao Xie

AAAI 2022, pp. 10184-10192 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The experiments are performed on three popular NAS datasets, namely CIFAR-10, CIFAR-100 (Krizhevsky, Hinton et al. 2009) and ImageNet (Deng et al. 2009). Experimental results strongly demonstrate the effectiveness of our model. The results of the classification error (%) of different NAS methods on CIFAR-100 are shown in Table 1.
Researcher Affiliation | Academia | 1 University of California, San Diego, USA; 2 Zhejiang University, China; 3 Carnegie Mellon University, USA; 4 Mohamed bin Zayed University of Artificial Intelligence, UAE
Pseudocode | Yes | The overall algorithm of LFM, when applied to other related methods, can be summarised in Algorithm 1; LFM can be applied to any differentiable NAS method. Algorithm 1 (LFM): while not converged do: 1) update W1; 2) update W2; 3) update A, V, and r; end while. (A minimal code sketch of this loop is given after the table.)
Open Source Code | No | The paper does not provide an explicit statement about releasing code for its methodology or a direct link to a source-code repository.
Open Datasets | Yes | The experiments are performed on three popular NAS datasets, namely CIFAR-10, CIFAR-100 (Krizhevsky, Hinton et al. 2009) and ImageNet (Deng et al. 2009).
Dataset Splits | Yes | We split each of these datasets into a training set with 25K images, a validation set with 25K images, and a test set with 10K images. During architecture search in LFM, the training set is used as Dtr and the validation set is used as Dval. (A split sketch follows the table.)
Hardware Specification | Yes | The LFM experiments in this paper use A100 for DARTS and A40 for PDARTS. ...trained using two Tesla A100 GPUs...
Software Dependencies | No | The paper mentions using specific models and algorithms but does not provide specific version numbers for any software dependencies like programming languages, libraries (e.g., PyTorch, TensorFlow), or solvers.
Experiment Setup | Yes | For the architecture of the encoder model, we experimented with ResNet-18 and ResNet-34 (He et al. 2016b). The search algorithm was based on SGD, and the hyperparameters of epochs, initial learning rate, and momentum follow the original implementation of the respective DARTS (Liu, Simonyan, and Yang 2019) and PDARTS (Chen et al. 2021). We use a batch size of 64 for both DARTS and PDARTS. ...We trained the network with a batch size of 96, an epoch number of 600, on a single Tesla V100 GPU. ...with a batch size of 1024 and an epoch number of 250. The initial channel number is set to 48. (A configuration sketch follows the table.)
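
To make the pseudocode row concrete, here is a minimal PyTorch-style sketch of one iteration of Algorithm 1, assuming a DARTS-like setup. The function name, the cross-entropy losses, the mistake-based reweighting in stage 2, and the simplification of keeping r fixed (and omitting the encoder V) are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def lfm_search_step(w1, w2, r, opt_w1, opt_w2, opt_meta, train_batch, val_batch):
    """One iteration of the while-loop in Algorithm 1 (sketch only).

    w1, w2   : first- and second-stage networks sharing the searched architecture
    r        : scalar weight on misclassified examples (learned in the paper,
               kept fixed here for brevity; the encoder V is omitted entirely)
    opt_meta : optimizer over the architecture variables A
    """
    x_tr, y_tr = train_batch
    x_val, y_val = val_batch

    # 1) Update W1 on the training split D_tr.
    opt_w1.zero_grad()
    F.cross_entropy(w1(x_tr), y_tr).backward()
    opt_w1.step()

    # 2) Update W2, up-weighting the examples that W1 misclassified.
    with torch.no_grad():
        mistakes = (w1(x_tr).argmax(dim=1) != y_tr).float()
    per_example = F.cross_entropy(w2(x_tr), y_tr, reduction="none")
    opt_w2.zero_grad()
    ((1.0 + r * mistakes) * per_example).mean().backward()
    opt_w2.step()

    # 3) Update the architecture variables A on the validation split D_val
    #    (the paper also updates V and r in this step).
    opt_meta.zero_grad()
    F.cross_entropy(w2(x_val), y_val).backward()
    opt_meta.step()
```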
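The 25K/25K/10K split reported in the dataset-splits row can be illustrated with a simple torchvision split of CIFAR-10; the transform and the random seed below are assumptions, not details from the paper.

```python
import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

transform = transforms.ToTensor()  # placeholder; the paper's augmentations are not restated here

full_train = datasets.CIFAR10("./data", train=True, download=True, transform=transform)
test_set = datasets.CIFAR10("./data", train=False, download=True, transform=transform)  # 10K test images

# 50K official training images -> 25K for D_tr and 25K for D_val.
generator = torch.Generator().manual_seed(0)  # seed is an assumption
d_tr, d_val = random_split(full_train, [25_000, 25_000], generator=generator)
```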
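The experiment-setup row translates into roughly the following configuration sketch; the SGD hyperparameter values are placeholders, since the paper defers them to the original DARTS and PDARTS implementations.

```python
import torch

# Batch sizes, epoch counts, and channel width quoted in the experiment-setup row.
SEARCH_BATCH_SIZE = 64                                       # architecture search, DARTS and PDARTS
CIFAR_EVAL = {"batch_size": 96, "epochs": 600}               # retraining, single Tesla V100
IMAGENET_EVAL = {"batch_size": 1024, "epochs": 250, "init_channels": 48}

def make_sgd(params, lr=0.025, momentum=0.9, weight_decay=3e-4):
    """SGD as named in the paper; these hyperparameter values are assumptions,
    since the paper defers them to the DARTS/PDARTS reference code."""
    return torch.optim.SGD(params, lr=lr, momentum=momentum, weight_decay=weight_decay)
```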