RIM: Reliable Influence-based Active Learning on Graphs
Authors: Wentao Zhang, Yexin Wang, Zhenbang You, Meng Cao, Ping Huang, Jiulong Shan, Zhi Yang, Bin Cui
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We now verify the effectiveness of RIM on four real-world graphs. We aim to answer four questions. Q1: Compared with other state-of-the-art baselines, can RIM achieve better predictive accuracy? Q2: How do the influence quality and quantity influence RIM? Q3: Is RIM faster than the compared baselines in the end-to-end AL process? Q4: If RIM is more effective than the baselines, what should be the reason? |
| Researcher Affiliation | Collaboration | Wentao Zhang¹٬², Yexin Wang¹, Zhenbang You¹, Meng Cao², Ping Huang², Jiulong Shan², Zhi Yang¹٬³, Bin Cui¹٬³٬⁴. ¹School of CS, Peking University; ²Apple; ³National Engineering Laboratory for Big Data Analysis and Applications; ⁴Institute of Computational Social Science, Peking University (Qingdao), China |
| Pseudocode | Yes (a hedged code sketch of this loop follows the table) | Algorithm 1: Batch Node Selection. Input: Initial labeled set V0, query batch size b, and labeling accuracy α. Output: Labeled set Vl |
| Open Source Code | Yes | The implementation details are shown in Appendix A.4, and our code is available in the supplementary material. |
| Open Datasets | Yes (a loading sketch follows the table) | We use node classification tasks to evaluate RIM in both inductive and transductive settings [11] on three citation networks (i.e., Citeseer, Cora, and PubMed) [16] and one large social network (Reddit). |
| Dataset Splits | Yes | Suppose that the entire node set V is partitioned into training set Vtrain (including both the labeled set Vl and unlabeled set Vu), validation set Vval and test set Vtest. |
| Hardware Specification | Yes | All experiments run on a machine with 8 NVIDIA Tesla V100 32GB GPUs. |
| Software Dependencies | No | The paper mentions using 'OpenBox [19] for hyper-parameter tuning' but does not specify the version of OpenBox or of any other critical software dependency, such as the deep learning framework (e.g., PyTorch or TensorFlow) and its version number. |
| Experiment Setup | Yes (the quoted values are collected into a config sketch after the table) | For the hyperparameters, we follow the original paper for LP, AGE, ANRMAB, and GPA. For RIM, we set the propagation steps k = 2 for GCN and k = 10 for LP. The batch size is set to 20 for the citation networks and 100 for Reddit. We set the learning rate of GCN to 0.01 for Cora and Citeseer, and 0.005 for PubMed and Reddit. The weight decay is set to 0.0005. The hidden size is 128. |
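The Pseudocode row quotes Algorithm 1 only at the signature level. Below is a minimal sketch of a greedy batch-selection loop consistent with that signature, assuming a precomputed node-to-node `influence` matrix, reliability weighting by the labeling accuracy `alpha`, and a marginal-gain selection criterion; these assumptions are for illustration and are not the authors' exact procedure.

```python
import numpy as np

def batch_node_selection(influence, v0, b, alpha):
    """Greedy sketch consistent with Algorithm 1 (Batch Node Selection).

    influence : (n, n) array, influence[i, j] = influence of node i on node j
                (hypothetical precomputed matrix, not the paper's exact quantity)
    v0        : initial labeled set V0
    b         : query batch size
    alpha     : labeling accuracy of the oracle
    """
    labeled = set(v0)
    # Reliability-weighted activation the current labeled set exerts on each node.
    if labeled:
        activation = alpha * influence[list(labeled)].max(axis=0)
    else:
        activation = np.zeros(influence.shape[1])
    for _ in range(b):
        best_gain, best_v = -1.0, None
        for v in range(influence.shape[0]):
            if v in labeled:
                continue
            # Marginal reliable-influence gain of adding node v to the labeled set.
            gain = np.maximum(activation, alpha * influence[v]).sum() - activation.sum()
            if gain > best_gain:
                best_gain, best_v = gain, v
        labeled.add(best_v)
        activation = np.maximum(activation, alpha * influence[best_v])
    return labeled  # labeled set Vl after b queries
```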
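The paper does not state how the four graphs were obtained or which framework loads them. One plausible way to get Citeseer, Cora, PubMed, and Reddit with the standard Vtrain/Vval/Vtest partition described in the Dataset Splits row is PyTorch Geometric; the framework choice and root paths here are assumptions.

```python
# Hypothetical loading sketch with PyTorch Geometric (the paper does not
# state which data loader it uses).
from torch_geometric.datasets import Planetoid, Reddit

cora = Planetoid(root='data/Planetoid', name='Cora')[0]   # also: 'CiteSeer', 'PubMed'
reddit = Reddit(root='data/Reddit')[0]

# The standard boolean masks give the Vtrain / Vval / Vtest partition; the
# labeled set Vl is the subset of train_mask chosen by the AL policy.
print(cora.train_mask.sum(), cora.val_mask.sum(), cora.test_mask.sum())
```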
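For reference, the hyperparameters quoted in the Experiment Setup row can be collected into a single configuration. The values are transcribed from the quote; the dictionary structure and key names are illustrative, not the authors' actual config format.

```python
# Hyperparameters as quoted in the paper's experiment setup.
CONFIG = {
    "propagation_steps": {"GCN": 2, "LP": 10},
    "batch_size": {"Cora": 20, "Citeseer": 20, "PubMed": 20, "Reddit": 100},
    "learning_rate": {"Cora": 0.01, "Citeseer": 0.01, "PubMed": 0.005, "Reddit": 0.005},
    "weight_decay": 5e-4,
    "hidden_size": 128,
}
```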