Collaborative Refining for Learning from Inaccurate Labels

Authors: Bin Han, Yi-Xuan Sun, Ya-Lin Zhang, Libang Zhang, Haoran Hu, Longfei Li, Jun Zhou, Guo Ye, Huimei He

NeurIPS 2024

Reproducibility Variable Result LLM Response
Research Type Experimental Extensive experiments are conducted on benchmark and real-world datasets, which demonstrate the superiority of the proposed framework.
Researcher Affiliation Industry Bin Han, Yi-Xuan Sun, Ya-Lin Zhang, Libang Zhang, Haoran Hu, Longfei Li, Jun Zhou, Guo Ye, Huimei He (Ant Group) {binlin.hb, xuan.syx, lyn.zyl, libang.zlb, hhr327996, longyao.llf, jun.zhoujun, yeguo.yg, huimei.hhm}@antgroup.com
Pseudocode Yes Algorithm 1 Collaborative Refining for Learning from Inaccurate Labels (CRL).
Open Source Code No We will consider open-sourcing the code after the paper is accepted.
Open Datasets Yes Benchmark datasets. All the methods are evaluated on 13 benchmark datasets with two kinds of noise... Real-world datasets. Experiments are also conducted on two real-world datasets: CIFAR-10N and Sentiment. Both datasets were published on Amazon Mechanical Turk for annotation. Details of these datasets and labels can be found in Appendix B. (Appendix B then lists sources and citations, e.g., 'Diabetes dataset is sampled from a dataset on Kaggle', 'Sentiment: This dataset is the original one in the website').
Dataset Splits Yes For benchmark datasets, 70% of each dataset is utilized for training, 5% for validation, and 25% for testing.
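The reported 70%/5%/25% split can be sketched as follows. This is a minimal illustration only; the function name, seeding, and shuffling strategy are assumptions, not details taken from the paper:

```python
import numpy as np

def split_indices(n, train=0.70, val=0.05, seed=0):
    # Shuffle sample indices, then carve out 70% train, 5% validation,
    # and the remaining 25% test, matching the proportions in the report.
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n)
    n_train = int(n * train)
    n_val = int(n * val)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

train_idx, val_idx, test_idx = split_indices(1000)
```

For 1,000 samples this yields 700/50/250 disjoint index sets.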
Hardware Specification No The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments. It only describes the model architecture and general training setup.
Software Dependencies No The paper describes the model architecture and training parameters but does not specify version numbers for programming languages, libraries, or other software dependencies.
Experiment Setup Yes For our method... For RUS, we set the proportion of selected samples p = 0.8, and take the 5th epoch and the latest epoch during training as the selected epochs in Eq. (8). In practice, LRD-generated labels are held constant after 5 training epochs to mitigate the overfitting issue. For all of the methods, experiments are conducted with 0.001 learning rate, 100 training epochs, and 256 batch size on MLP with hidden dimension 128 for a fair comparison.
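The reported hyperparameters, and the p = 0.8 sample-selection step, can be sketched as below. Note this is a hedged reconstruction: the paper states only the proportion p, so the ranking criterion here (keeping the lowest-score samples, e.g. by per-sample loss) is an assumption, and `CONFIG` and `select_samples` are names introduced for illustration:

```python
import numpy as np

# Hyperparameters as reported in the paper's setup.
CONFIG = {"lr": 1e-3, "epochs": 100, "batch_size": 256, "hidden_dim": 128}

def select_samples(scores, p=0.8):
    # Keep the proportion p of samples with the lowest score
    # (e.g. per-sample loss); the exact RUS criterion is not
    # specified in this excerpt, so this ranking is assumed.
    k = int(len(scores) * p)
    keep = np.argsort(scores)[:k]
    return np.sort(keep)

selected = select_samples(np.array([0.5, 0.1, 0.9, 0.3, 0.2]))
```

With five samples and p = 0.8, four samples are retained and the highest-scoring one (0.9) is dropped.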