Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.
Semi-Supervised Neural Architecture Search
Authors: Renqian Luo, Xu Tan, Rui Wang, Tao Qin, Enhong Chen, Tie-Yan Liu
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | On NASBench-101 benchmark dataset, it achieves comparable accuracy with gradient-based method while using only 1/7 architecture-accuracy pairs. 2) It achieves higher accuracy under the same computational cost. It achieves 94.02% test accuracy on NASBench-101, outperforming all the baselines when using the same number of architectures. On ImageNet, it achieves 23.5% top-1 error rate (under 600M FLOPS constraint) using 4 GPU-days for search. We further apply it to LJSpeech text to speech task and it achieves 97% intelligibility rate in the low-resource setting and 15% test error rate in the robustness setting, with 9%, 7% improvements over the baseline respectively. |
| Researcher Affiliation | Collaboration | 1University of Science and Technology of China, Hefei, China 2Microsoft Research Asia, Beijing, China |
| Pseudocode | Yes | Algorithm 1 Semi-Supervised Neural Architecture Search |
| Open Source Code | No | The paper links to open-source code for the baseline methods (NAO, ProxylessNAS) but neither states that the code for the proposed method (SemiNAS) is open source nor provides a link to it. |
| Open Datasets | Yes | Dataset NASBench-101 [37] designs a cell-based search space following the common practice [42, 17, 15]. It includes 423,624 CNN architectures and trains each architecture on CIFAR-10 for 3 times. [...] We conduct experiments on the LJSpeech dataset [10] which contains 13100 text and speech data pairs with approximately 24 hours of speech audio. |
| Dataset Splits | Yes | We randomly sample 50,000 images from the training data as valid set for architecture search. |
| Hardware Specification | Yes | The search runs for 1 day on 4 V100 GPUs. [...] The search runs for 1 day on 4 P40 GPUs. |
| Software Dependencies | No | The paper mentions software components like Adam optimizer, SGD optimizer, and LSTM networks but does not provide specific version numbers for any software dependencies (e.g., Python, PyTorch, TensorFlow, CUDA). |
| Experiment Setup | Yes | We use Adam optimizer with a learning rate of 0.001. [...] We train the supernet on 4 GPUs for 20000 steps with a batch size of 128 per card. [...] The discovered architecture is trained for 300 epochs with a total batch size of 256. We use the SGD optimizer with an initial learning rate of 0.05 and a cosine learning rate schedule [16]. |
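
The final-training settings quoted in the Experiment Setup row (SGD, initial learning rate 0.05, cosine schedule, 300 epochs) can be illustrated with a minimal sketch. The quoted text does not say whether the rate is updated per epoch or per step, whether it decays all the way to zero, or whether warmup is used; the per-epoch update, the `min_lr` floor of 0.0, and the function name `cosine_lr` are all assumptions made here for illustration, not details taken from the paper.

```python
import math


def cosine_lr(epoch: int, total_epochs: int = 300,
              base_lr: float = 0.05, min_lr: float = 0.0) -> float:
    """Cosine-annealed learning rate for a 0-indexed epoch.

    base_lr=0.05 and total_epochs=300 come from the quoted setup;
    min_lr=0.0 and the per-epoch granularity are assumptions.
    """
    if not 0 <= epoch < total_epochs:
        raise ValueError("epoch must satisfy 0 <= epoch < total_epochs")
    progress = epoch / total_epochs  # fraction of training completed
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


# Learning rate for every epoch of the 300-epoch run.
schedule = [cosine_lr(e) for e in range(300)]
```

Under these assumptions the rate starts at 0.05, reaches half of that at the midpoint of training, and decreases monotonically toward `min_lr`.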