Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
ProfiX: Improving Profile-Guided Optimization in Compilers with Graph Neural Networks
Authors: Huiri Tan, Juyong Jiang, Jiasi Shen
NeurIPS 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on the SPEC 2017 benchmarks demonstrate that PROFIX achieves up to a 9.15% performance improvement compared to the state-of-the-art traditional algorithm and an average 6.26% improvement over the baseline machine learning models. These results highlight the effectiveness of PROFIX in optimizing real-world application profiles. |
| Researcher Affiliation | Academia | (1) The Hong Kong University of Science and Technology; (2) The Hong Kong University of Science and Technology (Guangzhou) |
| Pseudocode | No | The paper describes the model architecture and mathematical formulas (Eq 1-11) and provides an overview of the model structure in Figure 2, but it does not include a clearly labeled 'Pseudocode' or 'Algorithm' block, nor structured steps formatted like code or an algorithm. |
| Open Source Code | No | Answer: [No] Justification: We will share the code upon request. |
| Open Datasets | Yes | We evaluate PROFIX with a diverse dataset covering compiler toolchains (Clang [29], GCC [50]), database systems (MySQL [11], SQLite [13]), and performance benchmarks (SPEC CPU 2017, footnote 2). ... Footnote 2: https://www.spec.org/cpu2017/ |
| Dataset Splits | Yes | The processed data is split into training, validation, and test sets with a ratio of 80%/10%/10%. |
| Hardware Specification | Yes | We conduct all training and testing experiments on a server with 2 Intel(R) Xeon(R) Gold 6444Y CPU (16 Cores), 256 GB RAM, and 2 RTX 5880 GPU (48 GB Memory). |
| Software Dependencies | No | We use the PyTorch framework when implementing our model and baselines, cited in Section 1 and Section 4. (No version number is specified.) |
| Experiment Setup | Yes | Table 9: Key hyperparameters for model training. Learning Rate: 0.001; Train Batch Size: 128; Validate/Test Batch Size: 1; Optimizer: Adam; Weight Decay: 0 (no weight decay); Learning Rate Scheduler: StepLR (Step Size: 5, Gamma: 0.97); Epochs: 300; LSTM Hidden Size: 256; SAGE Attention Layers: 3; Dropout Rate: 0.1; Early Stopping Patience: 10; Loss Function: RMSE Loss; Train-Validate-Test Split Ratio: 80%, 10%, 10% |
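
The values in the Experiment Setup and Dataset Splits rows can be collected into a small configuration sketch. The authors' code is not public, so this is an illustration only and not their implementation. The key names and helper functions (`split_indices`, `step_lr`, `rmse`, `stop_epoch`) are hypothetical. The decay rule assumes the usual step-decay definition, where the rate is multiplied by gamma once every `step_size` epochs. The early-stopping rule assumes that patience counts consecutive epochs without a new best validation loss. The sketch uses only the Python standard library so that it runs without PyTorch.

```python
import math
import random

# Values as reported in Table 9 of the paper; the key names are hypothetical.
CONFIG = {
    "learning_rate": 0.001,
    "train_batch_size": 128,
    "eval_batch_size": 1,
    "optimizer": "Adam",
    "weight_decay": 0.0,
    "lr_step_size": 5,
    "lr_gamma": 0.97,
    "max_epochs": 300,
    "lstm_hidden_size": 256,
    "sage_attention_layers": 3,
    "dropout": 0.1,
    "early_stopping_patience": 10,
    "split_percent": (80, 10, 10),  # train / validation / test
}


def split_indices(n, percent=CONFIG["split_percent"], seed=0):
    """Shuffle range(n) and cut it into train/validation/test index lists."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)  # the seed is an assumption, not from the paper
    n_train = n * percent[0] // 100
    n_val = n * percent[1] // 100
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]


def step_lr(epoch, base_lr=CONFIG["learning_rate"],
            step_size=CONFIG["lr_step_size"], gamma=CONFIG["lr_gamma"]):
    """Learning rate after `epoch` completed epochs under step decay."""
    return base_lr * gamma ** (epoch // step_size)


def rmse(pred, target):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred))


def stop_epoch(val_losses, patience=CONFIG["early_stopping_patience"]):
    """Return the epoch index at which early stopping would trigger, or None."""
    best, stale = math.inf, 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, stale = loss, 0  # new best validation loss resets the counter
        else:
            stale += 1
            if stale >= patience:
                return epoch
    return None


if __name__ == "__main__":
    train, val, test = split_indices(1000)
    print(len(train), len(val), len(test))
    print(step_lr(0), step_lr(5), step_lr(CONFIG["max_epochs"] - 1))
```

Because the paper does not report a PyTorch version or a random seed, a rerun built on these settings may not match the published numbers exactly.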