Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. [K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026].
Theory-Inspired Path-Regularized Differential Network Architecture Search
Authors: Pan Zhou, Caiming Xiong, Richard Socher, Steven Chu Hong Hoi
NeurIPS 2020 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on image classification tasks validate its advantages. Here we evaluate PR-DARTS on classification task and compare it with representative state-of-the-art NAS approaches |
| Researcher Affiliation | Industry | Pan Zhou, Caiming Xiong, Richard Socher, Steven C.H. Hoi; Salesforce Research; EMAIL |
| Pseudocode | Yes | See optimization details in Algorithm 1 of Appendix A. |
| Open Source Code | Yes | Code is available at https://panzhous.github.io/. |
| Open Datasets | Yes | Datasets. CIFAR10 [40] and CIFAR100 [40] contain 50K training and 10K test images which are of size 32×32 and distribute over 10 classes in CIFAR10 and 100 classes in CIFAR100. ImageNet [41] has 1.28M training and 50K test images which roughly equally distribute over 1K object categories. |
| Dataset Splits | Yes | We divide 50K training samples in CIFAR10 into two equal-sized training and validation datasets. |
| Hardware Specification | Yes | In merely 0.17 GPU-days on Tesla V100, PR-DARTS respectively achieves... |
| Software Dependencies | No | The paper mentions optimizers like SGD and ADAM but does not specify software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | In PR-DARTS, we set λ1 = 0.01, λ2 = 0.005, and λ3 = 0.005 for regularization. Then we train the network 200 epochs with mini-batch size 128. We set temperature τ = 10 and linearly reduce it to 0.1, a = 0.1 and b = 1.1. We train the network 600 epochs with a mini-batch size of 128 from scratch. We also use drop-path with probability 0.2 and cutout [46] with length 16, for regularization. |
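
The Dataset Splits and Experiment Setup rows can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' code: the function names, the seeded random shuffle, and the choice to decay τ once per epoch across the 200 search epochs are all assumptions, since the quoted text only states an equal-sized split and a linear reduction from 10 to 0.1.

```python
import random

# Values quoted in the Experiment Setup row.
SEARCH_EPOCHS = 200
TAU_START, TAU_END = 10.0, 0.1


def split_train_val(num_samples=50_000, seed=0):
    """Split sample indices into two equal-sized halves (train / validation).

    Assumption: a seeded random shuffle; the source does not say how the
    50K CIFAR10 training samples are assigned to each half.
    """
    indices = list(range(num_samples))
    random.Random(seed).shuffle(indices)
    half = num_samples // 2
    return indices[:half], indices[half:]


def temperature(epoch, total_epochs=SEARCH_EPOCHS, start=TAU_START, end=TAU_END):
    """Linearly interpolate tau from `start` at epoch 0 to `end` at the last epoch.

    Assumption: the schedule is stepped per epoch over the whole search phase.
    """
    if total_epochs <= 1:
        return end
    frac = epoch / (total_epochs - 1)
    return start + frac * (end - start)


train_idx, val_idx = split_train_val()
print(len(train_idx), len(val_idx))
print(temperature(0), temperature(SEARCH_EPOCHS - 1))
```

The index lists could be passed to whatever data-loading utility is in use; a per-iteration schedule would only change the `epoch` and `total_epochs` arguments to step counts.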