Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Quantifying Distributional Invariance in Causal Subgraph for IRM-Free Graph Generalization
Authors: Yang Qiu, Yixiong Zou, Jun Wang, Wei Liu, Xiangyu Fu, Ruixuan Li
NeurIPS 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on two widely used benchmarks demonstrate that our method consistently outperforms state-of-the-art methods in graph generalization. |
| Researcher Affiliation | Collaboration | ¹School of Computer Science and Technology, Huazhong University of Science and Technology; ²i Wudao Tech |
| Pseudocode | No | The paper describes the methodology in Section 3 and illustrates the framework in Figure 4, but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is available at https://github.com/anders1123/IDG. |
| Open Datasets | Yes | We adopt two widely used benchmarks for graph OOD generalization, Graph OOD [8] and Drug OOD [15], across seven datasets: Motif, CMNIST, HIV, SST2, and Twitter from Graph OOD, and EC50 and IC50 from Drug OOD. |
| Dataset Splits | Yes | Each dataset contains one or more domains and is divided into domain-based splits, thereby introducing distribution shifts. ... As in prior work, we partition each dataset by its domain attribute to induce distribution shifts. For example, in the Motif basis-shift setting, the motif types in the test set are entirely disjoint from those in the training and validation sets, thus rigorously assessing model generalization. |
| Hardware Specification | Yes | Experiments in this paper are conducted on NVIDIA RTX3090 GPUs. |
| Software Dependencies | No | The paper mentions employing GIN as the backbone and describes optimization details, but it does not specify version numbers for any software components (e.g., Python, PyTorch, CUDA, GIN library version). |
| Experiment Setup | Yes | Following [9], we employ GIN for both the extractor and predictor, set (λ1, λ2) = (0.1, 0.01), and retain the original learning-rate and batch-size settings. |
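
The domain-based splitting quoted in the Dataset Splits row can be illustrated with a minimal sketch. It is not taken from the paper or its repository: the function name `split_by_domain`, the `domain` key, and the toy domain labels are all hypothetical, and the benchmarks' own loaders may work differently.

```python
def split_by_domain(samples, train_domains, val_domains, test_domains):
    """Partition samples into train/val/test by a (hypothetical) 'domain' key."""
    groups = {
        "train": set(train_domains),
        "val": set(val_domains),
        "test": set(test_domains),
    }
    # Test domains must be disjoint from training and validation domains,
    # otherwise the split does not induce a distribution shift at test time.
    if groups["test"] & (groups["train"] | groups["val"]):
        raise ValueError("test domains overlap with train/val domains")

    splits = {name: [] for name in groups}
    for sample in samples:
        for name, domains in groups.items():
            if sample["domain"] in domains:
                splits[name].append(sample)
                break  # each sample lands in at most one split
    return splits


# Toy data with made-up domain labels (assumption, for illustration only).
toy = [{"id": i, "domain": d} for i, d in enumerate(["a", "a", "b", "c", "c"])]
splits = split_by_domain(toy, train_domains=["a"], val_domains=["b"], test_domains=["c"])
print({name: [s["id"] for s in part] for name, part in splits.items()})
```

Samples whose domain is not listed in any of the three groups are simply left out, which keeps the sketch short; a real pipeline would likely want to report them.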