Mitigating Label Noise on Graphs via Topological Sample Selection

Authors: Yuhao Wu, Jiangchao Yao, Xiaobo Xia, Jun Yu, Ruxin Wang, Bo Han, Tongliang Liu

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we conduct extensive experiments to verify the effectiveness of our method and provide comprehensive ablation studies about the underlying mechanism of TSS.
Researcher Affiliation | Collaboration | ¹Sydney AI Center, The University of Sydney; ²CMIC, Shanghai Jiao Tong University; ³Shanghai AI Laboratory; ⁴University of Science and Technology of China; ⁵Alibaba Group; ⁶TMLR Group, Department of Computer Science, Hong Kong Baptist University.
Pseudocode | Yes | We summarize the procedure of TSS in Algorithm 1 of the Appendix.
Open Source Code | No | The paper does not contain any explicit statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets | Yes | Datasets: We adopted three small datasets, Cora, CiteSeer, and PubMed, with the default dataset split as done in (Chen et al., 2018), and four large datasets, WikiCS, Facebook, Physics, and DBLP, to evaluate our method.
Dataset Splits | Yes | We adopted three small datasets, Cora, CiteSeer, and PubMed, with the default dataset split as done in (Chen et al., 2018)... All hyper-parameters are tuned on a noisy validation set built by leaving out 10% of the noisy training data.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory, or cloud computing instance types) used for running the experiments.
Software Dependencies | No | The paper mentions general software components such as the Adam optimizer and a two-layer graph convolutional network, but does not specify their versions or the versions of other key software libraries (e.g., Python, PyTorch, TensorFlow) needed for reproducibility.
Experiment Setup | Yes | A two-layer graph convolutional network with a hidden dimension of 16 is deployed as the backbone for all methods. We apply the Adam optimizer (Kingma and Ba, 2014) with a learning rate of 0.01 and a weight decay of 5×10⁻⁴. The number of pre-training epochs is set to 400, while the number of retraining epochs is set to 500 for Cora and CiteSeer, and 1000 for PubMed, WikiCS, Facebook, Physics, and DBLP. All hyper-parameters are tuned on a noisy validation set built by leaving out 10% of the noisy training data.
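
For reference, below is a minimal sketch of the reported setup, assuming PyTorch Geometric: a two-layer GCN backbone with hidden dimension 16, Adam with learning rate 0.01 and weight decay 5×10⁻⁴, 400 pre-training epochs, and a noisy validation set held out from 10% of the training nodes. The Planetoid loader for Cora, the dropout rate, and the use of the dataset's clean labels (the paper injects synthetic label noise) are illustrative assumptions; the TSS sample-selection procedure itself (Algorithm 1 of the paper) is not reproduced here.

```python
import torch
import torch.nn.functional as F
from torch_geometric.datasets import Planetoid
from torch_geometric.nn import GCNConv

# Load Cora with its standard public split (assumption: the Planetoid
# default split stands in for the split of Chen et al., 2018).
dataset = Planetoid(root="data/Planetoid", name="Cora")
data = dataset[0]

class GCN(torch.nn.Module):
    """Two-layer GCN backbone with a hidden dimension of 16, as reported."""
    def __init__(self, in_dim, hid_dim, out_dim):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hid_dim)
        self.conv2 = GCNConv(hid_dim, out_dim)

    def forward(self, x, edge_index):
        x = F.relu(self.conv1(x, edge_index))
        x = F.dropout(x, p=0.5, training=self.training)  # dropout rate is an assumption
        return self.conv2(x, edge_index)

model = GCN(dataset.num_features, 16, dataset.num_classes)
# Adam with lr = 0.01 and weight decay = 5e-4, matching the reported setup.
optimizer = torch.optim.Adam(model.parameters(), lr=0.01, weight_decay=5e-4)

# Hold out 10% of the (noisy) training nodes as a noisy validation set.
train_idx = data.train_mask.nonzero(as_tuple=False).view(-1)
perm = train_idx[torch.randperm(train_idx.numel())]
n_val = int(0.1 * perm.numel())
val_idx, fit_idx = perm[:n_val], perm[n_val:]

# Pre-training loop (400 epochs, per the paper); labels here are clean,
# whereas the paper trains on synthetically corrupted labels.
model.train()
for epoch in range(400):
    optimizer.zero_grad()
    out = model(data.x, data.edge_index)
    loss = F.cross_entropy(out[fit_idx], data.y[fit_idx])
    loss.backward()
    optimizer.step()
```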