CP-NAS: Child-Parent Neural Architecture Search for 1-bit CNNs
Authors: Li'an Zhuo, Baochang Zhang, Hanlin Chen, Linlin Yang, Chen Chen, Yanjun Zhu, David Doermann
IJCAI 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrate that the proposed CP-NAS achieves a comparable accuracy with traditional NAS on both the CIFAR and ImageNet databases. It achieves the accuracy of 95.27% on CIFAR-10, 64.3% on ImageNet with binarized weights and activations, and a 30% faster search than prior arts. (Section 3, Experiments) In this section, we compare our CP-NAS with the state-of-the-art NAS methods and 1-bit CNNs methods on two publicly available datasets: CIFAR-10 [Krizhevsky et al., 2014] and ILSVRC12 ImageNet [Russakovsky et al., 2015]. |
| Researcher Affiliation | Academia | Li'an Zhuo1, Baochang Zhang1, Hanlin Chen1, Linlin Yang2, Chen Chen3, Yanjun Zhu4 and David Doermann4. 1School of Automation Science and Electrical Engineering, Beihang University; 2University of Bonn; 3University of North Carolina at Charlotte; 4University at Buffalo. {lianzhuo, bczhang, hlchen}@buaa.edu.cn |
| Pseudocode | Yes | Algorithm 1 Child-Parent NAS. Input: training data, validation data. Parameter: search hyper-graph G, K = 8, e(o_k^(i,j)) = 0 for all edges. Output: optimal structure α. 1: while (K > 1) do 2: for t = 1, ..., T_epoch do ... A hedged sketch of this search loop appears after the table. |
| Open Source Code | No | The paper does not include a specific link to source code or an explicit statement about the release of code for the described methodology. |
| Open Datasets | Yes | (Section 3, Experiments) In this section, we compare our CP-NAS with the state-of-the-art NAS methods and 1-bit CNNs methods on two publicly available datasets: CIFAR-10 [Krizhevsky et al., 2014] and ILSVRC12 ImageNet [Russakovsky et al., 2015]. |
| Dataset Splits | Yes | During the architecture search, the training set of the dataset is divided into two subsets, one for training the network weights and the other for performance evaluation as a validation set. ... Due to the efficient guidance of the CP model, we only use 50% of the training set with CIFAR-10 and ImageNet for architecture search and 5% of the training set for evaluation, leading to a faster search. A sketch of such a split appears after the table. |
| Hardware Specification | Yes | In terms of search efficiency, compared with the previous work PC-DARTS [Xu et al., 2019], our CP-NAS is 30% faster (tested on our platform with 6 NVIDIA TITAN V GPUs). |
| Software Dependencies | No | The paper states 'All the experiments and models are implemented in PyTorch [Paszke et al., 2017]' but does not provide specific version numbers for PyTorch or any other software dependencies. |
| Experiment Setup | Yes | We use SGD with momentum to optimize the network weights, with an initial learning rate of 0.025 (annealed down to zero following a cosine schedule), a momentum of 0.9, and a weight decay of 5 × 10^-4. When we search for the architecture directly on ImageNet, we use the same parameters for searching with CIFAR-10 except that the initial learning rate is set to 0.05 and βP is set to 0.33. A larger network of 10 cells ... is trained on CIFAR-10 for 600 epochs with a batch size of 96 ... We use the SGD optimizer with an initial learning rate of 0.025 ... a momentum of 0.9, a weight decay of 3 × 10^-4 and gradient clipping at 5. A sketch of this training setup appears after the table. |
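
The pseudocode row above only quotes the header and the first two lines of Algorithm 1. As a reading aid, here is a minimal Python sketch of that outer loop, assuming hypothetical caller-supplied helpers `train_epoch` (one joint training pass over the binarized child and full-precision parent supernets) and `edge_score` (the child-parent performance measure for one operation on one edge). The exact evaluation measure, the per-stage epoch count, and the pruning schedule are defined in the paper; halving K at each stage is an illustrative assumption.

```python
def cp_nas_search(edges, train_epoch, edge_score, K=8, epochs_per_stage=5):
    """Sketch of the Child-Parent NAS outer loop (Algorithm 1 in the paper).

    edges        -- iterable of edge identifiers in the search hyper-graph G
    train_epoch  -- callable(): one training epoch of child and parent supernets
    edge_score   -- callable(edge, op): child-parent performance measure (hypothetical)
    """
    # e[edge][op] accumulates the performance evaluation e(o_k^(i,j)) of each
    # candidate operation; initialised to 0 for all edges, as in the pseudocode.
    e = {edge: {op: 0.0 for op in range(K)} for edge in edges}

    while K > 1:
        for _ in range(epochs_per_stage):   # "for t = 1, ..., T_epoch do"
            train_epoch()
            for edge in e:
                for op in e[edge]:
                    e[edge][op] += edge_score(edge, op)

        # Illustrative pruning step: keep the better-scoring half of the
        # remaining operations on every edge and reset their scores.
        K //= 2
        for edge in e:
            kept = sorted(e[edge], key=e[edge].get, reverse=True)[:K]
            e[edge] = {op: 0.0 for op in kept}

    # One operation per edge remains; that selection is the output structure alpha.
    return {edge: next(iter(ops)) for edge, ops in e.items()}
```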
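
The dataset-splits row quotes that only 50% of the training set is used for architecture search and 5% for performance evaluation. Below is a minimal PyTorch/torchvision sketch of such a split on CIFAR-10; the quoted text does not say how the indices are drawn or what batch size is used during search, so the disjoint random subsets and the batch size of 96 are assumptions.

```python
import numpy as np
from torch.utils.data import DataLoader, Subset
from torchvision import datasets, transforms

# Full CIFAR-10 training set (50,000 images).
full_train = datasets.CIFAR10(root="./data", train=True, download=True,
                              transform=transforms.ToTensor())

# 50% of the training images for weight training during search,
# a disjoint 5% for the performance-evaluation (validation) subset.
indices = np.random.permutation(len(full_train))
n_search = int(0.50 * len(full_train))
n_eval = int(0.05 * len(full_train))

search_loader = DataLoader(Subset(full_train, indices[:n_search].tolist()),
                           batch_size=96, shuffle=True)
eval_loader = DataLoader(Subset(full_train,
                                indices[n_search:n_search + n_eval].tolist()),
                         batch_size=96, shuffle=False)
```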
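
The experiment-setup row gives the hyperparameters for training the final searched network on CIFAR-10: 600 epochs, batch size 96, SGD with momentum 0.9, an initial learning rate of 0.025 annealed to zero with a cosine schedule, weight decay 3e-4, and gradient clipping at 5. The following PyTorch sketch wires those quoted values together; the placeholder model stands in for the searched 10-cell 1-bit network, which this snippet does not construct.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# Placeholder standing in for the searched 10-cell 1-bit CNN.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))

train_loader = DataLoader(
    datasets.CIFAR10(root="./data", train=True, download=True,
                     transform=transforms.ToTensor()),
    batch_size=96, shuffle=True)

# Hyperparameters from the quoted setup: SGD with momentum, cosine annealing
# of the learning rate down to zero over 600 epochs, and gradient clipping.
optimizer = torch.optim.SGD(model.parameters(), lr=0.025,
                            momentum=0.9, weight_decay=3e-4)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=600)

for epoch in range(600):
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = F.cross_entropy(model(images), labels)
        loss.backward()
        torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=5.0)
        optimizer.step()
    scheduler.step()
```

For the search stage, the quoted text instead uses a weight decay of 5e-4 (and an initial learning rate of 0.05 with βP = 0.33 when searching directly on ImageNet); only the optimizer and scheduler lines above would change.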