Revisiting Parameter Sharing for Automatic Neural Channel Number Search

Authors: Jiaxing Wang, Haoli Bai, Jiaxiang Wu, Xupeng Shi, Junzhou Huang, Irwin King, Michael Lyu, Jian Cheng

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct extensive experiments to study the effects of parameter sharing on channel number search. Besides, the transitionary sharing strategy is shown to achieve a better balance between efficient searching and architecture discrimination. Experimental results on both CIFAR-10 and ImageNet datasets show that our approach outperforms a number of competitive counterparts.
Researcher Affiliation | Collaboration | Jiaxing Wang (1,3), Haoli Bai (2), Jiaxiang Wu (4), Xupeng Shi (5), Junzhou Huang (4,6), Irwin King (2), Michael Lyu (2), Jian Cheng (1,3). Affiliations: 1 NLPR, Institute of Automation, Chinese Academy of Sciences; 2 The Chinese University of Hong Kong; 3 School of Artificial Intelligence, University of Chinese Academy of Sciences; 4 Tencent AI Lab; 5 Northeastern University; 6 University of Texas at Arlington.
Pseudocode | Yes | An overall workflow is shown in Algorithm 1 of Appendix A.
Open Source Code | Yes | Code is available at https://github.com/haolibai/APS-channel-search.
Open Datasets | Yes | We conduct experiments on CIFAR-10 [17] and ImageNet 2012 [15], following standard data preprocessing techniques in [9, 27]. (A sketch of such standard preprocessing appears after this table.)
Dataset Splits | No | The paper mentions evaluating on the "validation set" and "warm-up training," and references "standard data preprocessing techniques," but does not provide explicit percentages, sample counts, or citations specifying the training/validation/test splits needed to reproduce the data partitioning.
Hardware Specification | Yes | To be consistent with [6], the total searching epoch is set to 600, which can be finished within 6.9 hours for ResNet-20 and 8.6 hours for ResNet-56 on a single NVIDIA Tesla-P40. ... The whole searching process can be finished within 24 hours for ResNet-18 and 48 hours for MobileNet-v2 on four NVIDIA Tesla-P40s.
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python, PyTorch, CUDA versions) needed to replicate the experiments. It implicitly refers to frameworks common in deep learning but lacks concrete version details.
Experiment Setup | Yes | A brief summarization of experimental setup is introduced below, while complete hyper-parameter settings and implementation details can be found in Appendix C. CIFAR-10 Experiments: For CIFAR-10, we take ResNet [9] as base models similar to [6, 12]. To be consistent with [6], the total searching epoch is set to 600, which can be finished within 6.9 hours for ResNet-20 and 8.6 hours for ResNet-56 on a single NVIDIA Tesla-P40. The first 200 epochs are used for warm-up training with fixed P, Q, and candidate architectures are uniformly sampled from C. The rest 400 epochs are left for transition and training of the RL controller. We set C = {16, 32, 64, 96} for the analysis of parameter sharing in Section 5.2 and 100% FLOPs search, and C = {4, 8, 16, 32, 64} when searching for more compact models to compare to other baselines in Section 5.3. ... ImageNet Experiments: For ImageNet experiments, we choose ResNet-18 and MobileNet-v2 as base models. For memory efficiency, we increase candidate channels after each down-sampling layer according to default expansion rates of base models. The initial candidates C are set to {32, 48, 64, 80} for ResNet-18 and {8, 12, 16, 20} for MobileNet-v2 respectively. We search for 160 epochs where the first 80 epochs are for warm-up training. (An illustrative sketch of this warm-up-then-controller schedule follows below.)
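The Open Datasets row cites "standard data preprocessing techniques in [9, 27]" without restating them. Below is a minimal PyTorch sketch of the preprocessing conventionally used with ResNets on CIFAR-10 (4-pixel padding with a random 32x32 crop, horizontal flip, per-channel normalization); the exact transforms and normalization statistics here are common defaults, not values quoted from the paper, and the authors' released code should be treated as the authoritative reference.

```python
# Sketch of "standard" CIFAR-10 preprocessing; values are common defaults,
# not quoted from the paper.
import torchvision.transforms as T
from torchvision.datasets import CIFAR10

CIFAR10_MEAN = (0.4914, 0.4822, 0.4465)
CIFAR10_STD = (0.2470, 0.2435, 0.2616)

train_transform = T.Compose([
    T.RandomCrop(32, padding=4),      # pad 4 pixels, then random 32x32 crop
    T.RandomHorizontalFlip(),
    T.ToTensor(),
    T.Normalize(CIFAR10_MEAN, CIFAR10_STD),
])
test_transform = T.Compose([
    T.ToTensor(),
    T.Normalize(CIFAR10_MEAN, CIFAR10_STD),
])

train_set = CIFAR10(root="./data", train=True, download=True, transform=train_transform)
test_set = CIFAR10(root="./data", train=False, download=True, transform=test_transform)
```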
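The Experiment Setup row describes a two-phase schedule: warm-up epochs in which candidate architectures are sampled uniformly from C while the sharing matrices P and Q stay fixed, followed by epochs that train the RL controller. The sketch below only illustrates that schedule under the assumption of independent per-layer uniform sampling; the layer count, train_supernet_step, and controller are hypothetical placeholders, and this is not a reproduction of the paper's Algorithm 1.

```python
# Illustrative sketch of the search schedule described above (not the authors' code).
import random

C = [16, 32, 64, 96]          # candidate channel numbers (CIFAR-10, 100% FLOPs search)
NUM_SEARCHABLE_LAYERS = 20    # assumption for illustration only
WARMUP_EPOCHS, TOTAL_EPOCHS = 200, 600

def sample_uniform_arch(candidates, num_layers):
    """Pick a channel number for every searchable layer uniformly at random."""
    return [random.choice(candidates) for _ in range(num_layers)]

for epoch in range(TOTAL_EPOCHS):
    if epoch < WARMUP_EPOCHS:
        # Warm-up phase: P, Q fixed; architectures drawn uniformly from C.
        arch = sample_uniform_arch(C, NUM_SEARCHABLE_LAYERS)
        # train_supernet_step(arch)        # hypothetical shared-weight training call
    else:
        # Transition phase: an RL controller proposes channel configurations and is
        # updated with a reward (e.g., validation accuracy); placeholders only.
        # arch = controller.sample()
        # train_supernet_step(arch)
        # controller.update(reward_of(arch))
        pass
```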