Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Cross-Domain Graph Data Scaling: A Showcase with Diffusion Models

Authors: Wenzhuo Tang, Haitao Mao, Danial Dervovic, Ivan Brugere, Saumitra Mishra, Yuying Xie, Jiliang Tang

NeurIPS 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirically, we apply UniAug to graphs from diverse domains and consistently observe performance improvement in node classification, link prediction, and graph property prediction. To the best of our knowledge, this study represents the first demonstration of a cross-domain data-scaling graph structure augmentor.
Researcher Affiliation | Collaboration | 1 Michigan State University 2 Amazon 3 J.P. Morgan AI Research
Pseudocode | No | The paper describes methods and processes using mathematical formulas and descriptive text, but it does not include explicitly labeled 'Pseudocode' or 'Algorithm' blocks, nor structured code-like steps.
Open Source Code | Yes | We also provide the code in the supplemental material.
Open Datasets | Yes | Within the publicly available graph databases, Network Repository [8] provides a comprehensive collection of graphs with varied scales from different domains, such as biological networks, chemical networks, social networks, and many more. In addition, we add a 1000 graphs subset of the GitHub Star dataset from TUDataset [80] to enlarge the coverage of diverse patterns and form an EXTRA collection.
Dataset Splits | Yes | Mean and standard deviation of accuracy (%) with 10-fold cross-validation on graph classification.
Hardware Specification | Yes | Throughout all the experiments, we train all the methods with Adam optimizer on an A100 GPU.
Software Dependencies | No | The paper does not explicitly list multiple key software components with their specific version numbers (e.g., programming language, libraries, or frameworks) used for implementation, beyond mentioning the Adam optimizer.
Experiment Setup | Yes | We have mainly four hyperparameters for UniAug: step-size γ and regularization strength λ in (5), number of repeats per training graph, and whether augment validation and test graphs with the trained guidance head. For each training graph, we repeatedly generate structures and plug in the original node features for multi-repeat augmentation. We perform the update in (5) for 5 times per each sampling step. The hyperparameters are tuned from the choices in Table 12.
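
The Dataset Splits entry refers to reporting the mean and standard deviation of accuracy over 10-fold cross-validation. As a rough illustration of what that protocol computes, here is a minimal Python sketch. It is not the authors' code: the function names, the `evaluate_fold` callback, the shuffling seed, and the use of the sample standard deviation are all assumptions made for this example.

```python
import random
import statistics


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle the indices 0..n_samples-1 and deal them into k disjoint folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Striding by k gives folds whose sizes differ by at most one.
    return [indices[i::k] for i in range(k)]


def cross_validate(n_samples, evaluate_fold, k=10, seed=0):
    """Return (mean, std) of accuracy in percent over k folds.

    evaluate_fold(train_idx, test_idx) is a hypothetical callback that trains
    on train_idx, tests on test_idx, and returns accuracy in [0, 1].
    """
    folds = k_fold_indices(n_samples, k, seed)
    scores = []
    for i, test_idx in enumerate(folds):
        # Every fold except the i-th one is used for training.
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        scores.append(100.0 * evaluate_fold(train_idx, test_idx))
    # Assumption: sample standard deviation; a paper may use the population form.
    return statistics.mean(scores), statistics.stdev(scores)
```

Calling `cross_validate(len(dataset), evaluate_fold)` with a real training routine in place of the callback would yield the two numbers that such a table typically reports as "mean ± std".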