Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et alK. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026..
Are Graph Neural Networks Optimal Approximation Algorithms?
Authors: Morris Yau, Nikolaos Karalias, Eric Lu, Jessica Xu, Stefanie Jegelka
NeurIPS 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our approach achieves strong empirical results across a wide range of real-world and synthetic datasets against solvers and neural baselines. |
| Researcher Affiliation | Academia | Morris Yau MIT CSAIL EMAIL Nikolaos Karalias MIT CSAIL EMAIL Eric Lu Harvard University EMAIL Jessica Xu Independent researcher formerly at MIT EMAIL Stefanie Jegelka TUM and MIT EMAIL |
| Pseudocode | Yes | Empirically, Opt GNN is simple to implement (see pseudocode in Appendix 5) and For pseudocode, please refer to the appendix (Max-Cut: algorithm 4 and general SDPs: algorithm 5). |
| Open Source Code | Yes | Code is available at: https://github.com/penlu/bespoke-gnn4do. |
| Open Datasets | Yes | Our real-world datasets are ENZYMES, PROTEINS, MUTAG, IMDB-BINARY, COLLAB (which we will together call TU-small), and REDDIT-BINARY, REDDIT-MULTI-5K, and REDDIT-MULTI-12K (which we will call TU-REDDIT). and We also tested Opt GNN on the GSET benchmark instances (Benlic & Hao, 2013). |
| Dataset Splits | Yes | For each dataset we hold out a validation and test slice for evaluation. In our generated graph experiments we set aside 1000 graphs each for validation and testing. (...) For TU-small, we used a train/validation/test split of 0.8/0.1/0.1. For TU-REDDIT, we set aside 100 graphs each for validation and testing. |
| Hardware Specification | Yes | Our training runs used 20 cores of an Intel Xeon Gold 6248 (for data loading and random graph generation) and a NVIDIA Tesla V100 GPU. Our Gurobi runs use 8 threads on a Intel Xeon Platinum 8260. Our Ka MIS runs use an Intel Core i9-13900H. Our Lw D and DGL-TREESEARCH runs use an Intel Core i9-13900H and an RTX 4060. |
| Software Dependencies | No | Our Gurobi runs use 8 threads on a Intel Xeon Platinum 8260. and with a standard automatic differentiation package like Py Torch (Paszke et al., 2019). |
| Experiment Setup | Yes | We ran each experiment on a range of hyperparameters. See Figure 5 for the hyperparameter listing. For all training runs, we used the Adam optimizer Kingma & Ba (2014) with a learning rate of 0.001. (...) For Max-3-SAT, we set the penalty term ρ = 0.003. |