Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.

Training Robust Graph Neural Networks by Modeling Noise Dependencies

Authors: Yeonjun In, Kanghoon Yoon, Sukwon Yun, Kibum Kim, Sungchul Kim, Chanyoung Park

NeurIPS 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	5 Experiments Datasets. We evaluate DA-GNN and baselines on five commonly used benchmark datasets and two newly introduced datasets, Auto and Garden, which are generated upon Amazon review data [30, 31] to mimic DANG on e-commerce systems (Refer to Appendix E.2.2 for details). The details of the datasets are given in Appendix E.1. Experimental Details. We evaluated DA-GNN in both node classification and link prediction tasks, comparing it with noise-robust GNNs and generative GNN methods. For a thorough evaluation, we create synthetic and real-world DANG benchmark datasets, with details in Appendix E.2. We also account for other noise scenarios, commonly considered in this field, following [8, 5, 10]. Further details about the baselines, evaluation protocol, and implementation details can be found in Appendix E.3, E.4, and E.5, respectively. Table 1: Node classification accuracy (%) under synthetic DANG. OOM indicates out of memory on 24GB RTX3090.
Researcher Affiliation	Collaboration	Yeonjun In¹, Kanghoon Yoon¹, Sukwon Yun², Kibum Kim¹, Sungchul Kim³, Chanyoung Park¹; ¹KAIST, ²UNC Chapel Hill, ³Adobe Research
Pseudocode	Yes	The overall architecture and detailed algorithm of DA-GNN are provided in Fig 3 and Algorithm 1 in Appendix, respectively.
Open Source Code	Yes	Our code is available at https://github.com/yeonjun-in/torch-DA-GNN. Question: Does the paper provide open access to the data and code, with sufficient instructions to faithfully reproduce the main experimental results, as described in supplemental material? Answer: [Yes] Justification: We provide our source code including data and running code in the anonymous github repository.
Open Datasets	Yes	Datasets. We evaluate DA-GNN and baselines on five commonly used benchmark datasets and two newly introduced datasets, Auto and Garden, which are generated upon Amazon review data [30, 31] to mimic DANG on e-commerce systems (Refer to Appendix E.2.2 for details). The details of the datasets are given in Appendix E.1. Cora: https://github.com/ChandlerBang/Pro-GNN/ Citeseer: https://github.com/ChandlerBang/Pro-GNN/ Photo: https://pytorch-geometric.readthedocs.io/en/latest/ Computers: https://pytorch-geometric.readthedocs.io/en/latest/ Arxiv: https://ogb.stanford.edu/docs/nodeprop/#ogbn-arxiv Auto: http://jmcauley.ucsd.edu/data/amazon/links.html Garden: http://jmcauley.ucsd.edu/data/amazon/links.html
Dataset Splits	Yes	E.4 Evaluation Protocol We conduct both the node classification and link prediction tasks. For node classification, we perform a random split of the nodes, dividing them into a 1:1:8 ratio for training, validation, and testing nodes. Once a model is trained on the training nodes, we use the model to predict the labels of the test nodes. Regarding link prediction, we partition the provided edges into a 7:3 ratio for training and testing edges.
Hardware Specification	Yes	Table 1: Node classification accuracy (%) under synthetic DANG. OOM indicates out of memory on 24GB RTX3090.
Software Dependencies	No	E.5 Implementation Details For DA-GNN, the learning rate is tuned from {0.01, 0.005, 0.001, 0.0005}, and dropout rate and weight decay are fixed to 0.6 and 0.0005, respectively. In the inference of ZA, we use a 2-layer GCN model with 64 hidden dimension as GCNϕ1 and the dimension of node embedding d1 is fixed to 64. Explanation: The paper describes model architecture and hyperparameters but does not specify software libraries or frameworks with version numbers.
Experiment Setup	Yes	E.5 Implementation Details For each experiment, we report the average performance of 3 runs with standard deviations. For all baselines, we use the publicly available implementations and follow the implementation details presented in their original papers. For DA-GNN, the learning rate is tuned from {0.01, 0.005, 0.001, 0.0005}, and dropout rate and weight decay are fixed to 0.6 and 0.0005, respectively. In the inference of ZA, we use a 2-layer GCN model with 64 hidden dimension as GCNϕ1 and the dimension of node embedding d1 is fixed to 64. The γ value in calculating γ-hop subgraph similarity is tuned from {0, 1} and k in generating k-NN graph is tuned from {0, 10, 50, 100, 300}. In the inference of ZY, we use a 2-layer GCN model with 128 hidden dimension as GCNϕ3. In the inference of ϵX, the hidden dimension size of ϵX, i.e., d2, is fixed to 16. In the inference of ϵA, the early-learning phase is fixed to 30 epochs. In the implementation of the loss term $\mathbb{E}_{Z_A \sim q_{\phi_1}} \mathbb{E}_{\epsilon \sim q_{\phi_2}}[\log p_{\theta_1}(A \mid X, \epsilon, Z_A)]$, we tune the θ1 value from {0.1, 0.2, 0.3}. In the overall learning objective, i.e., Eqn 4, λ1 is tuned from {0.003, 0.03, 0.3, 3, 30}, λ2 is tuned from {0.003, 0.03, 0.3}, and λ3 is fixed to 0.001. We report the details of hyperparameter settings in Table 6.
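
The split ratios and tuning ranges quoted in the Dataset Splits and Experiment Setup rows can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' implementation: the function names (split_nodes, split_edges, grid), the dictionary keys, the use of integer division for rounding, and the seeding scheme are all assumptions made here. The quoted text also does not say whether the full Cartesian grid is searched, so the exhaustive enumeration below is an assumption as well.

```python
import itertools
import random


def split_nodes(num_nodes, seed=0):
    """Randomly split node indices 1:1:8 into train/val/test (hypothetical helper)."""
    rng = random.Random(seed)
    idx = list(range(num_nodes))
    rng.shuffle(idx)
    n_train = num_nodes // 10  # assumption: round down to whole nodes
    n_val = num_nodes // 10
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]


def split_edges(edges, seed=0):
    """Randomly split an edge list 7:3 into train/test (hypothetical helper)."""
    rng = random.Random(seed)
    shuffled = list(edges)
    rng.shuffle(shuffled)
    n_train = (7 * len(shuffled)) // 10
    return shuffled[:n_train], shuffled[n_train:]


# Tuned ranges as quoted from Appendix E.5; the key names are illustrative.
SEARCH_SPACE = {
    "lr": [0.01, 0.005, 0.001, 0.0005],
    "gamma": [0, 1],
    "k": [0, 10, 50, 100, 300],
    "theta1": [0.1, 0.2, 0.3],
    "lambda1": [0.003, 0.03, 0.3, 3, 30],
    "lambda2": [0.003, 0.03, 0.3],
}

# Values the quoted text reports as fixed.
FIXED = {
    "dropout": 0.6,
    "weight_decay": 0.0005,
    "lambda3": 0.001,
    "d1": 64,
    "d2": 16,
    "early_learning_epochs": 30,
}


def grid(space):
    """Yield every combination in the search space as a dict."""
    keys = list(space)
    for values in itertools.product(*(space[k] for k in keys)):
        yield dict(zip(keys, values))


if __name__ == "__main__":
    train, val, test = split_nodes(1000)
    print(len(train), len(val), len(test))
    print(sum(1 for _ in grid(SEARCH_SPACE)), "candidate configurations")
```

Enumerated exhaustively, the quoted ranges give 4 × 2 × 5 × 3 × 5 × 3 = 1800 combinations per dataset, each reported as the average of 3 runs; the per-dataset choices actually used are in the paper's Table 6.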