Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Serving Graph Compression for Graph Neural Networks

Authors: Si Si, Felix Yu, Ankit Singh Rawat, Cho-Jui Hsieh, Sanjiv Kumar

ICLR 2023 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experimental results on semi-supervised node classification demonstrate that the proposed method can significantly reduce the serving space requirement for GNN inference. (Abstract) 4 EXPERIMENTAL RESULTS (Section title)
Researcher Affiliation	Collaboration	¹Google Research ²University of California, Los Angeles EMAIL {chohsieh}@ucla.cs.edu
Pseudocode	Yes	Algorithm 1 The Virtual Node Graph (VNG) algorithm
Open Source Code	No	The paper mentions using a third-party open-source implementation for GCN model training ("Cluster GCN's tensorflow implementation"), but does not state that the code for their proposed VNG method is open-source or provided.
Open Datasets	Yes	All above datasets are publicly available and are commonly used for benchmarking the performance of GNNs on node classification tasks. (Section 4) Arxiv: ... We use the same dataset and partition as in (Hu et al., 2020). (Section 4) Reddit: ... We use the same dataset and partition as in (Chiang et al., 2019). (Section 4) Product: ... based on a different preprocessing and split by Hu et al. (2020). (Section 4) Amazon2M: ... We use the same dataset and partition as in (Chiang et al., 2019). (Section 4)
Dataset Splits	Yes	Table 2: The statistics of Arxiv, Reddit, Product, and Amazon2M datasets. #Training Nodes #Validate Nodes #Labels #Features Serving size Arxiv: ... We use the same dataset and partition as in (Hu et al., 2020). Reddit: ... We use the same dataset and partition as in (Chiang et al., 2019). Product: ... based on a different preprocessing and split by Hu et al. (2020). Amazon2M: ... We use the same dataset and partition as in (Chiang et al., 2019).
Hardware Specification	No	The paper does not explicitly state the specific hardware, such as GPU or CPU models, used for running the experiments.
Software Dependencies	No	The paper mentions 'Cluster GCN's tensorflow implementation' but does not specify a version number for TensorFlow or any other software dependencies.
Experiment Setup	Yes	As for architecture, on all datasets, we consider a 4-layer GCN model with hidden dimensions 512, 256, 512, and 400 for Product, Arxiv, Reddit, and Amazon2M, respectively, and the mean aggregator from Hamilton et al. (2017).
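
The setup quoted in the Experiment Setup row can be written down as a small configuration sketch. This is illustrative only and is not the authors' code. The names `HIDDEN_DIM`, `layer_shapes`, and `mean_aggregate` are hypothetical. Two things are assumptions because the quoted text does not specify them: that all hidden layers of a model share the listed width, and that a node without neighbors keeps its own features.

```python
# Hidden width per dataset, as quoted in the Experiment Setup row.
HIDDEN_DIM = {"Product": 512, "Arxiv": 256, "Reddit": 512, "Amazon2M": 400}
NUM_LAYERS = 4  # the row states a 4-layer GCN on all datasets


def layer_shapes(dataset, num_features, num_labels):
    """Return (in_dim, out_dim) for each of the NUM_LAYERS layers.

    Assumption: every hidden layer uses the same width and the final
    layer maps to the label space; the source gives one width per dataset.
    """
    hidden = HIDDEN_DIM[dataset]
    dims = [num_features] + [hidden] * (NUM_LAYERS - 1) + [num_labels]
    return list(zip(dims[:-1], dims[1:]))


def mean_aggregate(features, neighbors):
    """Mean aggregator: average the feature vectors of each node's neighbors.

    features:  list of equal-length float lists, one per node.
    neighbors: list of neighbor-index lists, one per node.
    Assumption: a node with no neighbors keeps its own features.
    """
    out = []
    for node, nbrs in enumerate(neighbors):
        if not nbrs:
            out.append(list(features[node]))
            continue
        dim = len(features[node])
        out.append(
            [sum(features[j][k] for j in nbrs) / len(nbrs) for k in range(dim)]
        )
    return out


if __name__ == "__main__":
    # Feature and label counts below are placeholder inputs, not values from the page.
    print(layer_shapes("Arxiv", num_features=128, num_labels=40))
    print(mean_aggregate([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]], [[1, 2], [0], []]))
```

A full GCN layer would follow the aggregation with a learned linear map and a nonlinearity; those are omitted here because the row gives no further detail about them.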