RIM: Reliable Influence-based Active Learning on Graphs

Authors: Wentao Zhang, Yexin Wang, Zhenbang You, Meng Cao, Ping Huang, Jiulong Shan, Zhi Yang, Bin Cui

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We now verify the effectiveness of RIM on four real-world graphs. We aim to answer four questions. Q1: Compared with other state-of-the-art baselines, can RIM achieve better predictive accuracy? Q2: How do influence quality and quantity affect RIM? Q3: Is RIM faster than the compared baselines in the end-to-end AL process? Q4: If RIM is more effective than the baselines, what should be the reason?
Researcher Affiliation | Collaboration | Wentao Zhang (1,2), Yexin Wang (1), Zhenbang You (1), Meng Cao (2), Ping Huang (2), Jiulong Shan (2), Zhi Yang (1,3), Bin Cui (1,3,4). Affiliations: 1 School of CS, Peking University; 2 Apple; 3 National Engineering Laboratory for Big Data Analysis and Applications; 4 Institute of Computational Social Science, Peking University (Qingdao), China
Pseudocode | Yes | Algorithm 1: Batch Node Selection. Input: initial labeled set V0, query batch size b, and labeling accuracy α. Output: labeled set Vl
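Algorithm 1's interface (inputs V0, b, α; output Vl) suggests a greedy batch-selection loop. A minimal sketch, assuming a hypothetical `influence` scoring function that stands in for RIM's reliable-influence criterion, whose details are not reproduced in this summary:

```python
def batch_node_selection(v0, candidates, b, influence):
    """Greedily grow the labeled set by b nodes, one node per step.

    `influence(v, labeled)` is a hypothetical stand-in for RIM's
    reliable-influence score; the labeling accuracy α would reweight
    this score in the full method and is omitted from the sketch.
    """
    labeled = set(v0)
    pool = set(candidates) - labeled
    for _ in range(b):
        # Pick the unlabeled node whose addition scores highest.
        best = max(pool, key=lambda v: influence(v, labeled))
        labeled.add(best)
        pool.remove(best)
    return labeled
```

With a toy score such as `lambda v, labeled: v`, the loop simply picks the largest remaining node ids, which makes the greedy selection order easy to check.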
Open Source Code | Yes | The implementation details are shown in Appendix A.4, and our code is available in the supplementary material.
Open Datasets | Yes | We use node classification tasks to evaluate RIM in both inductive and transductive settings [11] on three citation networks (i.e., Citeseer, Cora, and PubMed) [16] and one large social network (Reddit).
Dataset Splits | Yes | Suppose that the entire node set V is partitioned into the training set Vtrain (including both the labeled set Vl and the unlabeled set Vu), the validation set Vval, and the test set Vtest.
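The partition above can be sketched as a plain node-id split; the helper name `split_nodes` and the split sizes are illustrative, not taken from the paper:

```python
import random

def split_nodes(nodes, n_train, n_val, seed=0):
    """Partition node ids into train/val/test sets.

    The train set would be further divided into labeled (Vl) and
    unlabeled (Vu) nodes as active learning selects nodes to label.
    """
    rng = random.Random(seed)
    shuffled = list(nodes)
    rng.shuffle(shuffled)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test
```

The three lists are disjoint by construction and together cover the full node set, matching the partition V = Vtrain ∪ Vval ∪ Vtest.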
Hardware Specification | Yes | All experiments run on a machine with 8 NVIDIA Tesla V100 32GB GPUs.
Software Dependencies | No | The paper mentions using OpenBox [19] for hyper-parameter tuning but does not specify its version, nor the versions of other critical dependencies such as the deep learning framework (e.g., PyTorch or TensorFlow).
Experiment Setup | Yes | For the hyperparameters, we follow the original papers for LP, AGE, ANRMAB, and GPA. For RIM, we set the propagation steps k = 2 for GCN and k = 10 for LP. The batch size is set to 20 for the citation networks and 100 for Reddit. We set the learning rate of GCN to 0.01 for Cora and Citeseer, and 0.005 for PubMed and Reddit. The weight decay is set to 0.0005. The hidden size is 128.
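The reported settings can be gathered into one configuration sketch. The values come from the quoted setup above; the dict layout and the name `RIM_CONFIG` are illustrative:

```python
# Hyperparameters as reported in the paper's experiment setup;
# the nesting by dataset/model is an illustrative layout choice.
RIM_CONFIG = {
    "propagation_steps": {"GCN": 2, "LP": 10},
    "batch_size": {"Cora": 20, "Citeseer": 20, "PubMed": 20, "Reddit": 100},
    "learning_rate": {"Cora": 0.01, "Citeseer": 0.01,
                      "PubMed": 0.005, "Reddit": 0.005},
    "weight_decay": 5e-4,   # 0.0005
    "hidden_size": 128,
}
```

Collecting the settings this way makes the per-dataset differences (batch size and learning rate) explicit at a glance.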