Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Calibrate and Debias Layer-wise Sampling for Graph Convolutional Networks
Authors: Yifan Chen, Tianning Xu, Dilek Hakkani-Tur, Di Jin, Yun Yang, Ruoqing Zhu
TMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The improvements are demonstrated by extensive analyses of estimation variance and experiments on common benchmarks. Code and algorithm implementations are publicly available at https://github.com/ychen-stat-ml/GCN-layer-wise-sampling. We conduct empirical experiments on 5 large real-world datasets to ensure a fair comparison and representative results. To study the influence of the aforementioned issues, we evaluate the matrix approximation error (cf. Section 4.3 and Figure 2) of different methods in one-step propagation. In Section 6 we further evaluate the prediction accuracy on the test sets, for both intermediate models during training and final outputs, using the metrics in Table 1. |
| Researcher Affiliation | Collaboration | Yifan Chen¹, Tianning Xu¹, Dilek Hakkani-Tur², Di Jin², Yun Yang¹, Ruoqing Zhu¹ — ¹ University of Illinois Urbana-Champaign, ² Amazon Alexa AI |
| Pseudocode | Yes | Algorithm 1: Iterative updates of coefficients to construct the ultimate debiased estimator Ys. |
| Open Source Code | Yes | Code and algorithm implementations are publicly available at https://github.com/ychen-stat-ml/GCN-layer-wise-sampling. |
| Open Datasets | Yes | The datasets (see details in Table 1) involve: Reddit (Hamilton et al., 2017), ogbn-arxiv, ogbn-proteins, ogbn-mag, and ogbn-products (Hu et al., 2020). Reddit is a traditional large graph dataset used by Chen et al. (2018b); Zou et al. (2019); Chen et al. (2018a); Cong et al. (2020); Zeng et al. (2020). Ogbn-arxiv, ogbn-proteins, ogbn-mag, and ogbn-products are proposed in Open Graph Benchmarks (OGB) by Hu et al. (2020). |
| Dataset Splits | Yes | Table 1: Summary of datasets. Split Ratio refers to the ratio of training/validation/test data. Reddit: 232,965 nodes, 11,606,919 edges, avg. degree 50, 602 features, 41 classes, 1 task, split 66/10/24, F1-score; ogbn-arxiv: 160,343 nodes, 1,166,243 edges, avg. degree 13.7, 128 features, 40 classes, 1 task, split 54/18/28, Accuracy; ogbn-proteins: 132,534 nodes, 39,561,252 edges, avg. degree 597.0, 8 features, binary classes, 112 tasks, split 65/16/19, ROC-AUC; ogbn-mag: 736,389 nodes, 5,396,336 edges, avg. degree 7.3, 128 features, 349 classes, 1 task, split 85/9/6, Accuracy; ogbn-products: 2,449,029 nodes, 61,859,140 edges, avg. degree 50.5, 100 features, 47 classes, 1 task, split 8/2/90, Accuracy. |
| Hardware Specification | Yes | We use one Tesla V100 SXM2 16GB GPU with 10 CPU threads to train all the models listed in Section 6. |
| Software Dependencies | No | All the models are implemented in PyTorch. |
| Experiment Setup | Yes | In training, we use a 2-layer GCN for each task, trained with an Adam optimizer. ... The number of hidden units is 256 and the batch size is 512. For layer-wise sampling methods, we consider two settings for the node sample size: 1. fixed at 512 (equal to the batch size); 2. an increasing setting (denoted with a suffix (2)), in which twice as many nodes are sampled in the next layer. For node-wise sampling methods (GraphSAGE, VR-GCN), the sample size per node is 2 (denoted with a suffix (2)). For the subgraph sampling method GraphSAINT, the subgraph size is by default equal to the batch size. ... For the details of model training, the learning rate is 0.001 and the dropout rate is 0.2, meaning 20 percent of hidden units are randomly dropped during training. |
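The two node-sample-size settings quoted in the Experiment Setup row can be sketched in plain Python. This is a minimal illustration of the schedules only, not the authors' implementation (see their repository for the real code); the helper name `layerwise_sample_sizes` is hypothetical.

```python
# Illustrative sketch of the layer-wise node sample-size schedules
# described in the experiment setup (hypothetical helper, not the
# authors' code).

BATCH_SIZE = 512  # batch size used in the paper's experiments

def layerwise_sample_sizes(num_layers, doubling=False):
    """Per-layer node sample sizes for layer-wise sampling.

    doubling=False: fixed at the batch size (512) in every layer.
    doubling=True:  the "(2)" setting, where twice as many nodes
                    are sampled in each successive layer.
    """
    sizes = []
    size = BATCH_SIZE
    for _ in range(num_layers):
        sizes.append(size)
        if doubling:
            size *= 2
    return sizes

# For the 2-layer GCN used in the experiments:
print(layerwise_sample_sizes(2))                 # fixed setting: [512, 512]
print(layerwise_sample_sizes(2, doubling=True))  # "(2)" setting: [512, 1024]
```

With a 2-layer GCN the fixed setting samples 512 nodes per layer, while the "(2)" setting samples 512 and then 1024.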