Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Hephaestus: Mixture Generative Modeling with Energy Guidance for Large-scale QoS Degradation

Authors: Nguyen Do, Bach Ngo, Youval Kashuv, Canh Pham, Hanghang Tong, My T. Thai

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experiments on both synthetic and real-world networks show that our approach consistently outperforms classical and ML baselines, particularly in scenarios with nonlinear cost functions where traditional methods fail to generalize.
Researcher Affiliation	Academia	1University of Florida, FL, USA 2The Frazer School, FL, USA 3ORLab, Phenikaa University, Viet Nam 4University of Illinois at Urbana-Champaign, IL, USA
Pseudocode	Yes	Algorithm 1: Hephaestus Main Framework Algorithm 2: Forge Algorithm 3: SPAGAN Training Algorithm 4: Predictive Path Stressing (PPS) Algorithm 5: Morph Algorithm 6: Refine Algorithm 7: Inference Process Algorithm 8: Predictive Path Stressing Inference (PPS-I)
Open Source Code	Yes	An anonymized code is in the supplemental material with scripts for training SPAGAN, Mix-CVAE, EBM, and the RL refinement phase, along with instructions and preprocessed data for both synthetic and real datasets.
Open Datasets	Yes	Real-world datasets include Email [61], Gnutella [62], Road CA [63], and Skitter [64], covering diverse scales and domains.
Dataset Splits	Yes	For synthetic graphs, we generate Erdős-Rényi topologies with n = 1024 nodes, varying edge density l, fixed threshold T = 20, and |K| = 10 critical pairs. For training the entire Hephaestus framework in both synthetic and real networks, we generate a large corpus of graphs with varying architectures (e.g., Barabási-Albert, Erdős-Rényi, and Watts-Strogatz), as well as diverse edge densities and thresholds, while keeping the number of nodes consistent with those in the testing graphs.
Hardware Specification	Yes	All experiments were conducted on a workstation equipped with an Intel Core i9-14900K CPU, 192 GB RAM, and 2 NVIDIA RTX 4090 GPUs (total 48 GB VRAM).
Software Dependencies	No	We compare Hephaestus with a range of baselines, including classical approximation methods and learning-based IP solvers. For learning-based IP methods, we evaluate three SOTA methods for ILP: DIFFILO [41], Predict-and-Search [34], and L-MILPOPT [25], which combine neural predictors with ILP solvers like Gurobi [36] or SCIP [37]. ... All datasets (e.g., Email, Gnutella, Road CA) and tools (e.g., Gurobi, SCIP) are properly cited with references [35], [41], [60–63], and their usage complies with open academic licensing terms.
Experiment Setup	Yes	In Table 3, we provide a detailed summary of the hyperparameters used across the three core phases of the Hephaestus framework: Forge, Morph, and Refine. These settings were chosen through extensive trial runs and empirical tuning to identify the best-performing configurations.
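
The synthetic setup quoted in the Dataset Splits row (Erdős-Rényi topologies with n = 1024 nodes, threshold T = 20, and |K| = 10 critical pairs) could be approximated with a sketch like the one below. It is not the authors' generator. It assumes that "edge density" corresponds to the edge probability p of the G(n, p) model and that critical pairs are drawn uniformly at random; the function names, the default p = 0.01, and the seed handling are hypothetical choices made for illustration only.

```python
import random
from itertools import combinations


def erdos_renyi_edges(n, p, rng):
    """Sample a G(n, p) graph: each unordered node pair becomes an edge with probability p."""
    return [(u, v) for u, v in combinations(range(n), 2) if rng.random() < p]


def sample_critical_pairs(n, k, rng):
    """Draw k distinct unordered node pairs (u, v) with u < v, uniformly at random.

    Assumption: the paper's critical pairs are treated here as plain node pairs.
    """
    pairs = set()
    while len(pairs) < k:
        u, v = rng.sample(range(n), 2)  # two different nodes
        pairs.add((min(u, v), max(u, v)))
    return sorted(pairs)


def make_instance(n=1024, p=0.01, k=10, threshold=20, seed=0):
    """Build one synthetic instance; p = 0.01 is an illustrative value, not from the paper."""
    rng = random.Random(seed)  # fixed seed so the instance can be regenerated
    return {
        "n": n,
        "edges": erdos_renyi_edges(n, p, rng),
        "critical_pairs": sample_critical_pairs(n, k, rng),
        "threshold": threshold,
    }


if __name__ == "__main__":
    instance = make_instance()
    print(len(instance["edges"]), "edges;", len(instance["critical_pairs"]), "critical pairs")
```

Fixing the seed per instance makes a generated corpus repeatable, which is the property the Dataset Splits variable is meant to capture. Sweeping p and the threshold over a grid would mimic the "varying edge density" part of the description.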