Deep Insights into Noisy Pseudo Labeling on Graph Data

Authors: Botao Wang, Jia Li, Yang Liu, Jiashun Cheng, Yu Rong, Wenjia Wang, Fugee Tsung

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments demonstrate that the proposed strategy improves the graph learning process and outperforms other PL strategies on link prediction and node classification tasks.
Researcher Affiliation | Collaboration | 1 Hong Kong University of Science and Technology, Hong Kong SAR, China; 2 Hong Kong University of Science and Technology (Guangzhou), Guangzhou, China; 3 Tencent AI Lab, Shenzhen, China
Pseudocode | Yes | Algorithm 1: Iterative cautious pseudo labeling (a sketch of the loop follows the table).
Open Source Code | Yes | The implementation is open-sourced at https://github.com/AcEbt/CPL.
Open Datasets | Yes | We adopt five publicly available datasets to evaluate the CPL strategy for link prediction, i.e. CiteSeer, Actor, WikiCS, Twitch_PT, and Amazon_Photo, and five datasets for node classification, i.e. Cora, CiteSeer, PubMed, Amazon_Photo, and LastFMAsia. Detailed statistics are reported in Table 1.
Dataset Splits | Yes | In the link prediction task, as there are few PL-based methods, we apply the CPL strategy on three popular models: GAE [12], node2vec [4], and SEAL [29]. To reserve sufficient candidate unobserved samples for PL, the dataset is randomly split into 10%, 40%, and 50% for training, validation, and testing (a split sketch follows the table).
Hardware Specification | No | No specific hardware details, such as GPU models, CPU types, or memory specifications used for running experiments, are mentioned in the paper.
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions) that would be needed to replicate the experiment.
Experiment Setup | No | The paper mentions using 5 random seeds, setting k for PL samples, and applying single augmentation methods three times. However, it does not provide specific hyperparameters such as learning rate, batch size, number of epochs, or optimizer details for the models.
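
The Pseudocode row above refers to Algorithm 1, iterative cautious pseudo labeling; the authors' actual implementation is in the linked repository (https://github.com/AcEbt/CPL). The following is only a minimal sketch of such an iterative loop for node classification, assuming a standard PyTorch model that maps node features to class logits. The function names `cautious_pseudo_labeling` and `train_fn`, and the `k` / `num_iterations` parameters, are illustrative and not taken from the paper.

```python
import torch

def cautious_pseudo_labeling(model, features, labels, train_mask, unlabeled_mask,
                             k=100, num_iterations=10, train_fn=None):
    """Minimal sketch of an iterative cautious pseudo-labeling loop.

    Each round: (re)train on the current labeled set, score the unlabeled
    nodes, and promote only the top-k most confident predictions to pseudo
    labels. `train_fn(model, features, labels, mask)` is a hypothetical
    training routine supplied by the caller.
    """
    labels = labels.clone()
    train_mask = train_mask.clone()
    unlabeled_mask = unlabeled_mask.clone()

    for _ in range(num_iterations):
        train_fn(model, features, labels, train_mask)        # retrain on current labels
        with torch.no_grad():
            probs = torch.softmax(model(features), dim=-1)   # class probabilities
        conf, pred = probs.max(dim=-1)
        conf[~unlabeled_mask] = float("-inf")                # consider unlabeled nodes only
        top = conf.topk(min(k, int(unlabeled_mask.sum()))).indices
        labels[top] = pred[top]                              # accept only the top-k confident labels
        train_mask[top] = True
        unlabeled_mask[top] = False
    return model, labels, train_mask
```

The cautious element, as reflected in this sketch, is that only a small, high-confidence subset of unlabeled samples is promoted per round rather than every prediction.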
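For the 10%/40%/50% split reported in the Dataset Splits row, one way to produce such an edge split is PyTorch Geometric's `RandomLinkSplit`; the paper does not state which tooling was used, so this is an assumed, minimal sketch using CiteSeer as an example.

```python
from torch_geometric.datasets import Planetoid
from torch_geometric.transforms import RandomLinkSplit

# Load one of the link-prediction datasets (CiteSeer used here as an example).
dataset = Planetoid(root="data/CiteSeer", name="CiteSeer")

# Hold out 40% of edges for validation and 50% for testing,
# leaving 10% for training, matching the split reported in the paper.
transform = RandomLinkSplit(num_val=0.4, num_test=0.5, is_undirected=True)
train_data, val_data, test_data = transform(dataset[0])
```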