Robust Offline Active Learning on Graphs
Authors: Yuanchen Wu, Yubai Yuan
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive numerical experiments show that the proposed method is competitive with existing graph-based active learning methods, especially when node covariates and responses contain noises. Additionally, the proposed method is applicable to both regression and classification tasks on graphs. We provide a theoretical guarantee for the effectiveness of the proposed method in semi-supervised learning tasks. Our theoretical results also highlight an interesting trade-off between informativeness and representativeness in graph-based active learning. |
| Researcher Affiliation | Academia | Yuanchen Wu, Department of Statistics, The Pennsylvania State University, yqw5734@psu.edu; Yubai Yuan, Department of Statistics, The Pennsylvania State University, yvy5509@psu.edu |
| Pseudocode | Yes | Algorithm 1 Biased Sampling Query Algorithm |
| Open Source Code | Yes | The implementation code for the proposed algorithm is available at github.com/YuanchenWu/RobustActiveLearning/. |
| Open Datasets | Yes | We evaluate the proposed method for node classification tasks on real-world datasets, which include five networks with varying homophily levels (high to low: Cora, PubMed, Citeseer, Chameleon and Texas) and two large-scale networks (Ogbn-Arxiv and Co-Physics)... We use open-source datasets that can be readily downloaded online. |
| Dataset Splits | No | The paper describes training details like epochs, learning rate, and weight decay, and mentions evaluation on 'unlabeled nodes' or 'test accuracy'. However, it does not explicitly define or specify the percentages or sample counts for training, validation, and test splits needed to reproduce the experiment's data partitioning. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU models, or memory specifications used for running the experiments. It only mentions 'Average query time (in seconds)' without linking it to specific hardware. |
| Software Dependencies | No | The paper states that 'we train a 2-layer SGC model'. However, it does not provide specific version numbers for SGC or any other software libraries or frameworks (e.g., Python, PyTorch, TensorFlow) used in the implementation or experiments. |
| Experiment Setup | Yes | For the proposed method and all baselines, we train a 2-layer SGC model for a fixed 300 epochs. During training, the initial learning rate is set to 10⁻² and weight decay as 10⁻⁴. |
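The quoted setup (a 2-layer SGC trained for 300 epochs with learning rate 10⁻² and weight decay 10⁻⁴) can be sketched as follows. This is a minimal, self-contained illustration using a synthetic random graph and labels, not the paper's datasets or code; the graph size, feature dimension, and use of full-batch gradient descent with softmax regression are assumptions for the sketch. SGC collapses the GNN into K propagation steps over the normalized adjacency followed by a single linear classifier, so a "2-layer" SGC corresponds to K = 2.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, c = 20, 8, 3                       # nodes, feature dim, classes (placeholders)
X = rng.standard_normal((n, d))          # synthetic node covariates
y = rng.integers(0, c, n)                # synthetic labels
A = (rng.random((n, n)) < 0.2).astype(float)
A = ((A + A.T) > 0).astype(float)        # symmetrize the random adjacency

# SGC precomputation: S = D^{-1/2} (A + I) D^{-1/2}, then features <- S^K X
A_hat = A + np.eye(n)
deg = A_hat.sum(axis=1)
S = A_hat / np.sqrt(deg[:, None] * deg[None, :])
K = 2                                    # "2-layer" SGC = 2 propagation steps
X_prop = np.linalg.matrix_power(S, K) @ X

# Single linear (softmax) classifier on propagated features,
# trained full-batch with the quoted hyperparameters.
W = np.zeros((d, c))
lr, wd = 1e-2, 1e-4                      # learning rate 10^-2, weight decay 10^-4
Y_onehot = np.eye(c)[y]
for epoch in range(300):                 # fixed 300 epochs, as quoted
    logits = X_prop @ W
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    P = np.exp(logits)
    P /= P.sum(axis=1, keepdims=True)
    grad = X_prop.T @ (P - Y_onehot) / n + wd * W # cross-entropy + L2 gradient
    W -= lr * grad

acc = (np.argmax(X_prop @ W, axis=1) == y).mean()
```

Because propagation is precomputed once, the per-epoch cost is just that of logistic regression, which is the main practical appeal of SGC as the backbone model here.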