DATA: Differentiable ArchiTecture Approximation
Authors: Jianlong Chang, Xinbang Zhang, Yiwen Guo, Gaofeng Meng, Shiming Xiang, Chunhong Pan
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on a variety of popular datasets strongly evidence that our method is capable of discovering high-performance architectures for image classification, language modeling and semantic segmentation, while guaranteeing the requisite efficiency during searching. |
| Researcher Affiliation | Collaboration | 1NLPR, Institute of Automation, Chinese Academy of Sciences 2School of Artificial Intelligence, University of Chinese Academy of Sciences 3Samsung Research China Beijing, 4Intel Labs China, 5Bytedance AI Lab {jianlong.chang, xinbang.zhang, gfmeng, smxiang, chpan}@nlpr.ia.ac.cn guoyiwen.ai@bytedance.com |
| Pseudocode | No | The paper describes algorithms verbally and with equations, but does not contain any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Specifically, the core code of DATA is released at https://github.com/XinbangZhang/DATA-NAS. |
| Open Datasets | Yes | First, the cell architectures are searched based on our EGS estimator and the best cells are found according to their validation performance. Second, the transferability of the best cells learned on CIFAR-10 [28] and Penn Treebank (PTB) [51] are investigated by using them on large datasets, i.e., classification on ImageNet [10] and language modeling on WikiText-2 (WT2) [39], respectively. ... semantic segmentation on the PASCAL VOC-2012. |
| Dataset Splits | Yes | First, the cell architectures are searched based on our EGS estimator and the best cells are found according to their validation performance. ... a large network of 20 cells is trained from scratch for 600 epochs with batch size 96 and report its performance on the test set. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU models, CPU types) used for running experiments. |
| Software Dependencies | No | The paper does not provide specific software names with version numbers for its dependencies or libraries. |
| Experiment Setup | Yes | In EGS, the sampling time M is set to 4 and 7 for a rich search space. During searching, the ReLU-Conv-BN order is utilized... a large network of 20 cells is trained from scratch for 600 epochs with batch size 96... we set cutout with size 16, path dropout of probability 0.2 and auxiliary towers with weight 0.4... An architecture of 14 cells is trained for 250 epochs with batch size 128, weight decay 3×10⁻⁵ and poly learning rate scheduler with initial learning rate 0.1. |
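The setup quoted above can be collected into a small configuration sketch. This is a hypothetical summary, not code from the released DATA repository: the key names and the poly-schedule power (0.9 is a common default) are assumptions.

```python
# Hypothetical summary of the evaluation hyperparameters quoted from the paper.
# Key names and the poly power (0.9) are assumptions, not from the DATA code.

def poly_lr(epoch, total_epochs=250, base_lr=0.1, power=0.9):
    """Poly learning-rate schedule: decays from base_lr toward 0 over training."""
    return base_lr * (1.0 - epoch / total_epochs) ** power

# CIFAR-10 evaluation: a 20-cell network trained from scratch.
cifar10_eval = {
    "num_cells": 20,
    "epochs": 600,
    "batch_size": 96,
    "cutout_size": 16,
    "path_dropout": 0.2,
    "auxiliary_weight": 0.4,
}

# PASCAL VOC-2012 segmentation: a 14-cell network.
segmentation = {
    "num_cells": 14,
    "epochs": 250,
    "batch_size": 128,
    "weight_decay": 3e-5,
    "initial_lr": 0.1,
}
```

The poly scheduler starts at the reported initial learning rate of 0.1 and decays monotonically to zero by the final epoch.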