Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

IntraMix: Intra-Class Mixup Generation for Accurate Labels and Neighbors

Authors: Shenghe Zheng, Hongzhi Wang, Xianglong Liu

NeurIPS 2024 | Venue PDF | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrate the effectiveness of IntraMix across various GNNs and datasets. |
| Researcher Affiliation | Academia | Shenghe Zheng, Hongzhi Wang, Xianglong Liu — Massive Data Computing Lab, Harbin Institute of Technology. EMAIL EMAIL |
| Pseudocode | Yes | Algorithm 1: Workflow of IntraMix |
| Open Source Code | Yes | Our code is available at: https://github.com/Zhengsh123/IntraMix |
| Open Datasets | Yes | Datasets: We evaluate IntraMix on commonly used medium-scale semi-supervised datasets for node classification, including Cora, CiteSeer, Pubmed [34], CS, and Physics [35]. ... We also conduct semi-supervised experiments on large-scale graphs, including ogbn-arxiv [16] and Flickr [49]. |
| Dataset Splits | Yes | We follow the original splits for these datasets. We also conduct semi-supervised experiments on large-scale graphs, including ogbn-arxiv [16] and Flickr [49]. We alter the original full-supervised splits on these datasets, using 1% and 5% of the original training data for semi-supervised experiments, respectively. Details can be found in Appendix C.1. (Section 4.1, Semi-supervised Learning); Table 8 also provides the "Split Ratio" (e.g., 8.5/30.5/61 for Cora). |
| Hardware Specification | Yes | All experiments are conducted on a single NVIDIA RTX-3090. |
| Software Dependencies | No | The paper does not explicitly list software dependencies with version numbers (e.g., Python version, or library versions such as PyTorch, TensorFlow, or scikit-learn). |
| Experiment Setup | Yes | For each graph augmentation applied to each GNN, we use the same hyperparameters for fairness. When comparing with other methods, we use the settings from their open-source code and report the average results over 30 runs. (Section 4.1, Semi-supervised Learning) Also: "Sensitivity analysis of λ indicates that the best performance is achieved when λ = 0.5." (Figure 3 caption) and "Therefore, we choose λ ∼ B(2, 2), where B denotes the Beta distribution." (Section 4.4, Ablation Experiment, Sensitivity Analysis of λ) |
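To make the reported setup concrete, the core idea named in the title — intra-class Mixup with a mixing coefficient λ ∼ B(2, 2) — can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation; the function name `intra_class_mixup` and its signature are assumptions for the example, and only the Beta(2, 2) sampling and the within-class mixing are taken from the paper's description.

```python
import numpy as np

rng = np.random.default_rng(0)

def intra_class_mixup(x, y, n_generated, alpha=2.0, rng=rng):
    """Sketch of intra-class Mixup (hypothetical helper, not the paper's code):
    mix pairs of node features that share the same (possibly noisy) label,
    with mixing coefficient lambda drawn from Beta(alpha, alpha) = B(2, 2)."""
    new_x, new_y = [], []
    for _ in range(n_generated):
        c = rng.choice(np.unique(y))           # pick a class at random
        idx = np.flatnonzero(y == c)           # nodes labeled with that class
        i, j = rng.choice(idx, size=2, replace=True)
        lam = rng.beta(alpha, alpha)           # lambda ~ B(2, 2)
        new_x.append(lam * x[i] + (1 - lam) * x[j])
        new_y.append(c)                        # generated node keeps the class label
    return np.stack(new_x), np.array(new_y)
```

Because both endpoints share a label, the mixed sample stays inside that class, which is why the generated nodes can be assigned the class label directly; the paper's full pipeline additionally connects generated nodes to neighbors, which this sketch omits.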