reproducibilityindex.ai

Collaborative Refining for Learning from Inaccurate Labels

Authors: Bin Han, Yi-Xuan Sun, Ya-Lin Zhang, Libang Zhang, Haoran Hu, Longfei Li, Jun Zhou, Guo Ye, Huimei He

NeurIPS 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experiments are conducted on benchmark and real-world datasets, which demonstrate the superiority of the proposed framework.
Researcher Affiliation	Industry	Bin Han, Yi-Xuan Sun, Ya-Lin Zhang, Libang Zhang, Haoran Hu, Longfei Li, Jun Zhou, Guo Ye, Huimei He; Ant Group; {binlin.hb, xuan.syx, lyn.zyl, libang.zlb, hhr327996, longyao.llf, jun.zhoujun, yeguo.yg, huimei.hhm}@antgroup.com
Pseudocode	Yes	Algorithm 1: Collaborative Refining for Learning from Inaccurate Labels (CRL).
Open Source Code	No	We will consider open-sourcing the code after the paper is accepted.
Open Datasets	Yes	Benchmark datasets. All the methods are evaluated on 13 benchmark datasets with two kinds of noise ... Real-world datasets. Experiments are also conducted on two real-world datasets: CIFAR-10N and Sentiment. Both datasets were published on Amazon Mechanical Turk for annotation. Details of these datasets and labels can be found in Appendix B. (Appendix B then lists sources and citations, e.g., 'Diabetes dataset is sampled from a dataset on Kaggle', 'Sentiment: This dataset is the original one in the website'.)
Dataset Splits	Yes	For benchmark datasets, 70% of each dataset is utilized for training, 5% for validation, and 25% for testing.
Hardware Specification	No	The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments. It only describes the model architecture and general training setup.
Software Dependencies	No	The paper describes the model architecture and training parameters but does not specify version numbers for programming languages, libraries, or other software dependencies.
Experiment Setup	Yes	For our method ... For RUS, we set the proportion of selected samples p = 0.8, and take the 5th epoch and the latest epoch during training as the selected epochs in Eq. (8). In practice, LRD-generated labels are held constant after 5 training epochs to mitigate the over-fitting issue. For all of the methods, experiments are conducted with 0.001 learning rate, 100 training epochs, and 256 batch size on MLP with hidden dimension 128 for a fair comparison.
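
For anyone attempting a reproduction without the released code, the Dataset Splits and Experiment Setup rows can be collected into a small starting sketch. The block below is not the authors' implementation. It only records the quoted hyperparameters and shows one way to produce a 70% / 5% / 25% train/validation/test split. The names SETUP and split_indices, the dictionary keys, the random shuffle, and the seed are all assumptions for illustration; the paper excerpts above do not say how samples were assigned to each split.

```python
import random

# Values quoted in the Experiment Setup row. The dict itself and its key
# names are hypothetical; only the numbers come from the report.
SETUP = {
    "learning_rate": 0.001,
    "epochs": 100,
    "batch_size": 256,
    "mlp_hidden_dim": 128,
    "rus_selected_proportion": 0.8,   # p in the RUS description
    "lrd_labels_fixed_after_epoch": 5,
}


def split_indices(n_samples, train_pct=70, val_pct=5, seed=0):
    """Return (train, val, test) index lists for a 70/5/25-style split.

    Shuffling with a fixed seed is an assumption; the remainder after the
    train and validation cuts is assigned to the test set.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Integer arithmetic avoids floating-point rounding in the cut points.
    n_train = n_samples * train_pct // 100
    n_val = n_samples * val_pct // 100
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


train_idx, val_idx, test_idx = split_indices(1000)
print(len(train_idx), len(val_idx), len(test_idx))
```

Because hardware and software versions are not reported, exact numbers from such a sketch should not be expected to match the paper; fixing the seed only makes a local run repeatable.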