Fast Neural Network Adaptation via Parameter Remapping and Architecture Search

Authors: Jiemin Fang*, Yuzhu Sun*, Kangjian Peng*, Qian Zhang, Yuan Li, Wenyu Liu, Xinggang Wang

ICLR 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In our experiments, we conduct FNA on MobileNetV2 to obtain new networks for both segmentation and detection that clearly outperform existing networks designed both manually and by NAS.
Researcher Affiliation | Collaboration | Jiemin Fang¹, Yuzhu Sun¹, Kangjian Peng², Qian Zhang², Yuan Li², Wenyu Liu¹, Xinggang Wang¹ (¹School of EIC, Huazhong University of Science and Technology; ²Horizon Robotics)
Pseudocode | Yes | Algorithm 1: Weights Remapping Function (a hedged sketch of such a remapping follows the table)
Open Source Code | Yes | The code is available at https://github.com/JaminFong/FNA.
Open Datasets | Yes | The semantic segmentation experiments are conducted on the Cityscapes (Cordts et al., 2016) dataset. ... The experiments are conducted on the MS-COCO dataset (Lin et al., 2014b).
Dataset Splits | Yes | In the architecture adaptation process, we randomly sample 20% images from the training set as the validation set for architecture parameters updating. ... In the search process of architecture adaptation, we randomly sample 50% data from the original trainval35k set as the validation set. (A split sketch follows the table.)
Hardware Specification | Yes | The whole search process is conducted on a single V100 GPU and takes only 1.4 hours in total. ... The whole parameter adaptation process is conducted on 4 TITAN-Xp GPUs and takes 100K iterations, which cost only 8.5 hours in total. ... All our experiments on object detection are conducted on TITAN-Xp GPUs.
Software Dependencies | No | The paper mentions software like 'DeepLabv3', 'RetinaNet', 'SSDLite', 'MMDetection', 'SGD optimizer', 'Adam optimizer', 'RMSProp optimizer', but does not provide specific version numbers for these or other key software dependencies.
Experiment Setup | Yes | The batch size is set as 16. We use the SGD optimizer with 0.9 momentum and 5×10⁻⁴ weight decay for operation weights and the Adam optimizer (Kingma & Ba, 2015) with 4×10⁻⁵ weight decay and a fixed learning rate 1×10⁻³ for architecture parameters. (An optimizer sketch follows the table.)
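The Pseudocode row refers to the paper's Algorithm 1 (Weights Remapping Function), whose exact rules are not quoted here. The sketch below is a minimal illustration of one plausible remapping for a single convolution weight, assuming center-aligned kernel expansion with zero padding and leading-channel copying; the function name `remap_conv_weight` and these specific rules are assumptions for illustration, not the paper's implementation.

```python
# Hypothetical sketch of remapping a seed convolution weight into a larger
# (or smaller) candidate shape. The width and kernel-size rules below are
# assumptions; the authoritative version is Algorithm 1 in the paper.
import torch


def remap_conv_weight(seed_w: torch.Tensor, out_c: int, in_c: int, k: int) -> torch.Tensor:
    """Remap a seed conv weight of shape (oc, ic, kh, kw) to (out_c, in_c, k, k)."""
    oc, ic, kh, kw = seed_w.shape  # square kernels assumed
    new_w = torch.zeros(out_c, in_c, k, k, dtype=seed_w.dtype)

    # Width remapping (assumed rule): copy the leading channels shared by both shapes.
    o, i = min(oc, out_c), min(ic, in_c)

    # Kernel-size remapping (assumed rule): align kernels at their centers; a larger
    # target kernel keeps zeros in its outer ring, a smaller one crops the center.
    src_k = min(kh, k)
    s_off = (kh - src_k) // 2   # offset into the seed kernel
    d_off = (k - src_k) // 2    # offset into the new kernel
    new_w[:o, :i, d_off:d_off + src_k, d_off:d_off + src_k] = \
        seed_w[:o, :i, s_off:s_off + src_k, s_off:s_off + src_k]
    return new_w


# Example: expand a MobileNetV2-style 3x3 kernel to a 5x5 search-space candidate.
seed = torch.randn(32, 32, 3, 3)
print(remap_conv_weight(seed, out_c=32, in_c=32, k=5).shape)  # torch.Size([32, 32, 5, 5])
```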
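The Dataset Splits row quotes sampling 20% of the training images as a validation set for updating architecture parameters. A minimal sketch of such a split is shown below, assuming a PyTorch `Dataset`; `cityscapes_train`, the helper name `split_for_search`, and the fixed seed are placeholders rather than details from the paper.

```python
# Minimal sketch of carving out a search-time validation split from the
# training set; all names here are placeholders, not the paper's code.
import torch
from torch.utils.data import random_split


def split_for_search(train_set, val_fraction=0.2, seed=0):
    n_val = int(len(train_set) * val_fraction)
    n_train = len(train_set) - n_val
    generator = torch.Generator().manual_seed(seed)
    # Operation weights train on the first subset; architecture parameters
    # are updated on the held-out second subset.
    return random_split(train_set, [n_train, n_val], generator=generator)


# search_train, search_val = split_for_search(cityscapes_train)  # 20% for Cityscapes
# det_train, det_val = split_for_search(coco_trainval35k, val_fraction=0.5)  # 50% for COCO
```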
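The Experiment Setup row quotes a two-optimizer configuration. The sketch below wires those hyperparameters into PyTorch optimizers, assuming a DARTS-style separation between operation weights and architecture parameters; the SGD learning rate `LR_W` is a placeholder because the excerpt does not state it.

```python
# Sketch of the quoted two-optimizer setup: SGD for operation weights,
# Adam for architecture parameters.
import torch

LR_W = 0.01  # placeholder value; the excerpt does not give the SGD learning rate


def build_optimizers(operation_weights, arch_params):
    w_opt = torch.optim.SGD(operation_weights, lr=LR_W,
                            momentum=0.9, weight_decay=5e-4)
    a_opt = torch.optim.Adam(arch_params, lr=1e-3,
                             weight_decay=4e-5)
    return w_opt, a_opt
```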