Neural Inheritance Relation Guided One-Shot Layer Assignment Search
Authors: Rang Meng, Weijie Chen, Di Xie, Yuan Zhang, Shiliang Pu
AAAI 2020, pp. 5158-5165 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Comprehensive experiments carried out on CIFAR-100 illustrate the efficiency of our proposed method. Our search results are strongly consistent with the optimal ones directly selected from the architecture dataset. To further confirm the generalization of our proposed method, we also conduct experiments on Tiny-ImageNet and ImageNet. |
| Researcher Affiliation | Collaboration | Rang Meng,¹ Weijie Chen,² Di Xie,² Yuan Zhang,² Shiliang Pu² (¹College of Control Science and Engineering, Zhejiang University; ²Hikvision Research Institute) r_meng@zju.edu.cn, {chenweijie5, xiedi, zhangyuan, pushiliang}@hikvision.com |
| Pseudocode | Yes | Algorithm 1: Layer Assignment Search Algorithm |
| Open Source Code | No | Bringing this question, we build a neural architecture dataset of different layer assignments, which consists of 908 different neural networks trained on CIFAR-100, including plain networks and residual networks (we will release later). |
| Open Datasets | Yes | Benchmark Datasets CIFAR-100 (Krizhevsky, Hinton, and others 2009) is a dataset for 100-class image classification. ... Tiny-ImageNet is a subset of ImageNet for 200-class image classification. ... ImageNet (Russakovsky et al. 2015) is a 1000-class image classification dataset... |
| Dataset Splits | Yes | There are 500 training images and 100 testing images per class with resolution 32×32. ... There are 500 training images, 50 validation images and 50 testing images per class with resolution 64×64. |
| Hardware Specification | Yes | We totally use 7 GPUs for training. ... TITAN XP |
| Software Dependencies | No | Not found. The paper mentions "PyTorch" but does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | During the training phase, we first zero-pad the images with 4 pixels on each side and then randomly crop them to produce 32×32 images, followed by random horizontal flipping. We normalize them by channel-mean subtraction and standard-deviation division for both the training dataset and the validation dataset. While building an architecture dataset of layer assignments, we train all the enumerated networks in PyTorch using SGD with Nesterov momentum 0.9. The base learning rate is set to 0.1 and multiplied by a factor of 0.2 at 60, 120, and 160 epochs, respectively. Weight decay is set to 0.0005. All the networks are trained with batch size 128 for 200 epochs. |
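The quoted setup maps directly onto standard PyTorch components. Below is a minimal sketch of the CIFAR-100 training recipe: the augmentation, optimizer, learning-rate schedule, batch size, and epoch count come from the quoted text, while the backbone (a stock torchvision ResNet-18 standing in for the paper's enumerated networks) and the CIFAR-100 normalization constants are assumptions for illustration, not details from the paper.

```python
import torch
from torch import nn, optim
from torchvision import datasets, transforms
from torchvision.models import resnet18

# Normalization constants: commonly used CIFAR-100 channel means/stds,
# NOT quoted from the paper (an assumption for illustration).
MEAN, STD = (0.5071, 0.4865, 0.4409), (0.2673, 0.2564, 0.2762)

# Augmentation as quoted: 4-pixel zero padding, random 32x32 crop,
# random horizontal flip, then per-channel normalization.
train_tf = transforms.Compose([
    transforms.RandomCrop(32, padding=4),   # zero-pads by default, then crops
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize(MEAN, STD),
])

train_set = datasets.CIFAR100("./data", train=True, download=True, transform=train_tf)
train_loader = torch.utils.data.DataLoader(train_set, batch_size=128,
                                           shuffle=True, num_workers=4)

# Placeholder backbone: the paper trains its own enumerated plain/residual
# networks; a stock ResNet-18 merely stands in so the sketch runs end to end.
model = resnet18(num_classes=100)

# Optimizer and schedule as quoted: SGD with Nesterov momentum 0.9, base LR 0.1
# multiplied by 0.2 at epochs 60/120/160, weight decay 0.0005, 200 epochs.
optimizer = optim.SGD(model.parameters(), lr=0.1, momentum=0.9,
                      nesterov=True, weight_decay=5e-4)
scheduler = optim.lr_scheduler.MultiStepLR(optimizer, milestones=[60, 120, 160], gamma=0.2)
criterion = nn.CrossEntropyLoss()

for epoch in range(200):
    model.train()
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
    scheduler.step()
```

`MultiStepLR` with `gamma=0.2` and milestones `[60, 120, 160]` reproduces the quoted step schedule exactly; everything not covered by the quoted text should be treated as a stand-in.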