Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Diffuse, Sample, Project: Plug-And-Play Controllable Graph Generation
Authors: Kartik Sharma, Srijan Kumar, Rakshit Trivedi
ICML 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our extensive experiments demonstrate that PRODIGY empowers state-of-the-art continuous and discrete diffusion models to produce graphs meeting specific, hard constraints. Our approach achieves up to 100% constraint satisfaction for non-attributed and molecular graphs, under a variety of constraints, marking a significant step forward in precise, interpretable graph generation. |
| Researcher Affiliation | Academia | 1Georgia Institute of Technology, Atlanta, GA, USA 2Massachusetts Institute of Technology, Cambridge, MA, USA. |
| Pseudocode | No | The paper describes the proposed method using mathematical equations and textual explanations but does not include any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | Yes | Code is provided on the project webpage: https://prodigy-diffusion.github.io. |
| Open Datasets | Yes | We consider five non-attributed graph datasets including three real-world graphs: Community-small, Ego-small, Enzymes (Jo et al., 2022), and two synthetic graphs: SBM, Planar (Martinkus et al., 2022). We also use two standard molecular datasets: QM9 (Ramakrishnan et al., 2014), ZINC250k (Irwin et al., 2012). |
| Dataset Splits | No | The paper mentions hyperparameter tuning ('We tune the hyperparameters to search for the optimal γt in Equation 4 to minimize the trade-off between constraint satisfaction and distributional preservation') but does not specify explicit training/validation/test dataset splits for their own model or for the pre-trained models they utilize. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU models, CPU types, memory) used to run the experiments. |
| Software Dependencies | No | The paper mentions using 'Pytorch' for efficient matrix operations, but it does not specify a version number for PyTorch or any other software dependencies. |
| Experiment Setup | Yes | We tune the hyperparameters to search for the optimal γt in Equation 4 to minimize the trade-off between constraint satisfaction and distributional preservation. In particular, we searched for the variables involved in these two functional forms, particularly, β ∈ [0.1, 1.0, 10.0, 100.0], γ0 ∈ [0, 0.1], p ∈ [0, 1, 5]. |
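
The candidate values quoted in the Experiment Setup row can be enumerated as a search grid. The following is a minimal sketch, not the authors' tuning code: the names `BETA_VALUES`, `GAMMA0_VALUES`, `P_VALUES`, and `hyperparameter_grid` are hypothetical, and it assumes a full Cartesian product over the three variables, which the quoted text does not confirm (the variables belong to two functional forms of γt, so some combinations may not apply in practice).

```python
from itertools import product

# Candidate values as quoted in the Experiment Setup row.
BETA_VALUES = [0.1, 1.0, 10.0, 100.0]
GAMMA0_VALUES = [0, 0.1]
P_VALUES = [0, 1, 5]


def hyperparameter_grid():
    """Return every (beta, gamma0, p) combination as a dict.

    Assumption: a full Cartesian product; the paper may have searched
    only a subset of these combinations.
    """
    return [
        {"beta": beta, "gamma0": gamma0, "p": p}
        for beta, gamma0, p in product(BETA_VALUES, GAMMA0_VALUES, P_VALUES)
    ]


grid = hyperparameter_grid()
print(len(grid))  # 4 * 2 * 3 = 24 candidate settings
```

Under that assumption the search covers 24 settings per dataset and constraint, small enough to evaluate exhaustively.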