Robust Offline Active Learning on Graphs
Authors: Yuanchen Wu, Yubai Yuan
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive numerical experiments show that the proposed method is competitive with existing graph-based active learning methods, especially when node covariates and responses contain noises. Additionally, the proposed method is applicable to both regression and classification tasks on graphs. We provide a theoretical guarantee for the effectiveness of the proposed method in semi-supervised learning tasks. Our theoretical results also highlight an interesting trade-off between informativeness and representativeness in graph-based active learning. |
| Researcher Affiliation | Academia | Yuanchen Wu, Department of Statistics, The Pennsylvania State University, yqw5734@psu.edu; Yubai Yuan, Department of Statistics, The Pennsylvania State University, yvy5509@psu.edu |
| Pseudocode | Yes | Algorithm 1 Biased Sampling Query Algorithm |
| Open Source Code | Yes | The implementation code for the proposed algorithm is available at github.com/YuanchenWu/RobustActiveLearning/. |
| Open Datasets | Yes | We evaluate the proposed method for node classification tasks on real-world datasets, which include five networks with varying homophily levels (high to low: Cora, PubMed, Citeseer, Chameleon and Texas) and two large-scale networks (Ogbn-Arxiv and Co-Physics)... We use open-source datasets that can be readily downloaded online. |
| Dataset Splits | No | The paper describes training details like epochs, learning rate, and weight decay, and mentions evaluation on 'unlabeled nodes' or 'test accuracy'. However, it does not explicitly define or specify the percentages or sample counts for training, validation, and test splits needed to reproduce the experiment's data partitioning. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU models, or memory specifications used for running the experiments. It only mentions 'Average query time (in seconds)' without linking it to specific hardware. |
| Software Dependencies | No | The paper states that 'we train a 2-layer SGC model'. However, it does not provide specific version numbers for SGC or any other software libraries or frameworks (e.g., Python, PyTorch, TensorFlow) used in the implementation or experiments. |
| Experiment Setup | Yes | For the proposed method and all baselines, we train a 2-layer SGC model for a fixed 300 epochs. During training, the initial learning rate is set to 10⁻² and weight decay as 10⁻⁴. |
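The quoted setup (a 2-layer SGC trained for 300 epochs with learning rate 10⁻² and weight decay 10⁻⁴) can be sketched as follows. This is a minimal, self-contained illustration using a synthetic random graph and labels, not the paper's datasets or code; the graph size, feature dimension, and use of full-batch gradient descent with softmax regression are assumptions for the sketch. SGC collapses the GNN into K propagation steps over the normalized adjacency followed by a single linear classifier, so a "2-layer" SGC corresponds to K = 2.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, c = 20, 8, 3                       # nodes, feature dim, classes (placeholders)
X = rng.standard_normal((n, d))          # synthetic node covariates
y = rng.integers(0, c, n)                # synthetic labels
A = (rng.random((n, n)) < 0.2).astype(float)
A = ((A + A.T) > 0).astype(float)        # symmetrize the random adjacency

# SGC precomputation: S = D^{-1/2} (A + I) D^{-1/2}, then features <- S^K X
A_hat = A + np.eye(n)
deg = A_hat.sum(axis=1)
S = A_hat / np.sqrt(deg[:, None] * deg[None, :])
K = 2                                    # "2-layer" SGC = 2 propagation steps
X_prop = np.linalg.matrix_power(S, K) @ X

# Single linear (softmax) classifier on propagated features,
# trained full-batch with the quoted hyperparameters.
W = np.zeros((d, c))
lr, wd = 1e-2, 1e-4                      # learning rate 10^-2, weight decay 10^-4
Y_onehot = np.eye(c)[y]
for epoch in range(300):                 # fixed 300 epochs, as quoted
    logits = X_prop @ W
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    P = np.exp(logits)
    P /= P.sum(axis=1, keepdims=True)
    grad = X_prop.T @ (P - Y_onehot) / n + wd * W # cross-entropy + L2 gradient
    W -= lr * grad

acc = (np.argmax(X_prop @ W, axis=1) == y).mean()
```

Because propagation is precomputed once, the per-epoch cost is just that of logistic regression, which is the main practical appeal of SGC as the backbone model here.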