NAT: Neural Architecture Transformer for Accurate and Compact Architectures

Authors: Yong Guo, Yin Zheng, Mingkui Tan, Qi Chen, Jian Chen, Peilin Zhao, Junzhou Huang

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To verify the effectiveness of the proposed strategies, we apply NAT on both hand-crafted architectures and NAS based architectures. Extensive experiments on two benchmark datasets, i.e., CIFAR-10 and ImageNet, demonstrate that the transformed architecture by NAT significantly outperforms both its original form and those architectures optimized by existing methods. From Section 4 (Experiments): In this section, we apply NAT on both hand-crafted and NAS based architectures, and conduct experiments on two image classification benchmark datasets, i.e., CIFAR-10 [22] and ImageNet [8].
Researcher Affiliation | Collaboration | Yong Guo, Yin Zheng, Mingkui Tan, Qi Chen, Jian Chen, Peilin Zhao, Junzhou Huang (South China University of Technology; Weixin Group, Tencent; Tencent AI Lab; University of Texas at Arlington)
Pseudocode | Yes | Algorithm 1: Training method for Neural Architecture Transformer (NAT).
Open Source Code | Yes | The source code of NAT is available at https://github.com/guoyongcs/NAT.
Open Datasets | Yes | Extensive experiments on two benchmark datasets, i.e., CIFAR-10 [22] and ImageNet [8], demonstrate that the transformed architecture by NAT significantly outperforms both its original form and those architectures optimized by existing methods.
Dataset Splits | Yes | We split the CIFAR-10 training set into 40% and 60% slices to train the model parameters w and the transformer parameters θ, respectively. (A data-split sketch is given after this table.)
Hardware Specification | No | The paper does not specify the hardware used to run the experiments, such as specific GPU or CPU models.
Software Dependencies | No | The paper states that 'All implementations are based on PyTorch', but does not provide a specific version number for PyTorch or any other software dependencies.
Experiment Setup | Yes | During training, we build the deep network by stacking 8 basic cells and train the transformer for 100 epochs. We set m = 1, n = 1, and λ = 0.003 in the training. We split the CIFAR-10 training set into 40% and 60% slices to train the model parameters w and the transformer parameters θ, respectively. For all the considered architectures, we follow the same settings of the original papers. (A hedged training-skeleton sketch using these settings follows the table.)
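
The Dataset Splits row describes a 40%/60% split of the CIFAR-10 training set for the model parameters w and the transformer parameters θ. The sketch below shows one way to produce such a split, assuming a standard torchvision CIFAR-10 pipeline; the transform, batch size, and random seed are illustrative assumptions, not the authors' settings.

```python
import torch
from torch.utils.data import DataLoader, random_split
from torchvision import datasets, transforms

# Standard CIFAR-10 training set (50,000 images). The transform is a
# placeholder, not the authors' exact preprocessing.
train_set = datasets.CIFAR10(
    root="./data", train=True, download=True,
    transform=transforms.ToTensor(),
)

# 40% of the training data for the model parameters w,
# 60% for the transformer parameters theta, as stated in the paper.
n_total = len(train_set)       # 50,000
n_w = int(0.4 * n_total)       # 20,000
n_theta = n_total - n_w        # 30,000
w_split, theta_split = random_split(
    train_set, [n_w, n_theta],
    generator=torch.Generator().manual_seed(0),  # seed is an assumption
)

w_loader = DataLoader(w_split, batch_size=64, shuffle=True)      # batch size assumed
theta_loader = DataLoader(theta_split, batch_size=64, shuffle=True)
```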
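The Experiment Setup row reports the hyperparameters but not how the two data slices are consumed. The skeleton below only illustrates an alternating structure (update w on the 40% slice, then θ on the 60% slice, for 100 epochs), reusing `w_loader` and `theta_loader` from the previous sketch. The stand-in modules `supernet` and `transformer`, the optimizers, the reading of m and n as inner-step counts, and the λ-weighted regularizer are all assumptions for illustration; NAT's actual objective for θ is defined by Algorithm 1 in the paper and is not reproduced here.

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins, NOT the authors' models. The real NAT network is
# built by stacking 8 basic cells; the transformer operates on architectures.
supernet = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
transformer = nn.Linear(10, 10)

w_optimizer = torch.optim.SGD(supernet.parameters(), lr=0.1)           # lr assumed
theta_optimizer = torch.optim.Adam(transformer.parameters(), lr=3e-4)  # lr assumed
criterion = nn.CrossEntropyLoss()

EPOCHS = 100    # "train the transformer for 100 epochs"
M, N = 1, 1     # m = n = 1; assumed here to be inner update counts per iteration
LAMBDA = 0.003  # lambda = 0.003; assumed here to weight a regularization term

for epoch in range(EPOCHS):
    # w_loader / theta_loader come from the data-split sketch above.
    for (xw, yw), (xt, yt) in zip(w_loader, theta_loader):
        # Update model parameters w on the 40% slice.
        for _ in range(M):
            w_optimizer.zero_grad()
            criterion(supernet(xw), yw).backward()
            w_optimizer.step()

        # Update transformer parameters theta on the 60% slice. The loss below
        # is a differentiable placeholder; NAT's real theta objective
        # (Algorithm 1) is not reproduced here.
        for _ in range(N):
            theta_optimizer.zero_grad()
            reg = sum(p.pow(2).sum() for p in transformer.parameters())
            loss_theta = criterion(transformer(supernet(xt)), yt) + LAMBDA * reg
            loss_theta.backward()
            theta_optimizer.step()
```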