Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Learning to Flow from Generative Pretext Tasks for Neural Architecture Encoding
Authors: Sunwoo Kim, Hyunjin Hwang, Kijung Shin
NeurIPS 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments show that FGP boosts encoder performance by up to 106% in Precision@1%, compared to the same encoder trained solely with supervised learning. We demonstrate the effectiveness of our pre-training method compared to baseline pre-training methods across multiple downstream tasks, including performance prediction and neural architecture search. |
| Researcher Affiliation | Academia | (1) Kim Jaechul Graduate School of AI, (2) School of Electrical Engineering, Korea Advanced Institute of Science and Technology (KAIST); {kswoo97, julia510, kijungs}@kaist.ac.kr |
| Pseudocode | No | The paper describes the proposed method and steps for obtaining the flow surrogate in Section 3 and its subsections, along with a visual representation in Figure 3. However, there are no explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code and datasets are available at https://github.com/kswoo97/FGPAnom. |
| Open Datasets | Yes | We leverage three computer vision neural architecture datasets, which are NAS-Bench-101 (NB-101) [51], NAS-Bench-201 (NB-201) [8], and NAS-Bench-301 (NB-301) [39] datasets. Our code and datasets are available at https://github.com/kswoo97/FGPAnom. |
| Dataset Splits | Yes | For NB-101 and NB-201 datasets, we follow the training and test splits provided in [34, 15]. For the NB-301 dataset, since the baseline method ZC-Proxy [56] requires certain numerical properties of architectures, we use a subset of the original NB-301 dataset where these properties are available. We sample 40 architectures from the test set to create a validation set, following the approach in [15]. |
| Hardware Specification | Yes | We conducted our experiments on a machine with NVIDIA RTX 8000 D6 GPUs (48GB memory) and two Intel Xeon Silver 4214R processors. |
| Software Dependencies | Yes | FGP is primarily implemented using the PyTorch (v1.12.1) and PyTorch Geometric (v2.2.0) libraries. |
| Experiment Setup | Yes | We use AdamW [30] as the learning optimizer. We set the batch size and pre-training epochs to 256 and 200, respectively. Appendix A.4 details hyperparameter tuning, including learning rate within {10^-3, 5*10^-4, 10^-4}, projection head dimension within {32, 64, 128, 256}, projection head number of layers within {1, 2, 3}, weight decay within {10^-6, 0.0}, and, for FGP, (λ1, λ2) within {(1/2, 1/2), (1/3, 2/3), (2/3, 1/3)}. |
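
To give a sense of the size of the search space in the Experiment Setup row, the sketch below enumerates every combination of the reported candidate values. It assumes an exhaustive grid, which this page does not confirm; the authors may have used a different search strategy. The dictionary keys and the `enumerate_configs` helper are hypothetical names chosen for illustration and are not taken from the paper or its repository.

```python
from itertools import product

# Fixed settings as reported in the Experiment Setup row.
FIXED = {"optimizer": "AdamW", "batch_size": 256, "pretrain_epochs": 200}

# Candidate values as reported; the key names are hypothetical.
SEARCH_SPACE = {
    "learning_rate": [1e-3, 5e-4, 1e-4],
    "proj_head_dim": [32, 64, 128, 256],
    "proj_head_layers": [1, 2, 3],
    "weight_decay": [1e-6, 0.0],
    "lambdas": [(1 / 2, 1 / 2), (1 / 3, 2 / 3), (2 / 3, 1 / 3)],  # (lambda1, lambda2)
}


def enumerate_configs(space, fixed):
    """Yield one config dict per point of the full Cartesian grid (an assumption)."""
    keys = list(space)
    for values in product(*(space[k] for k in keys)):
        config = dict(fixed)          # start from the fixed settings
        config.update(zip(keys, values))  # add one value per tuned hyperparameter
        yield config


configs = list(enumerate_configs(SEARCH_SPACE, FIXED))
print(len(configs))  # 3 * 4 * 3 * 2 * 3 = 216 combinations
```

Under this assumption, the grid contains 216 configurations per encoder and dataset, each of which would require a 200-epoch pre-training run.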