Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.
Confidence-based Reliable Learning under Dual Noises
Authors: Peng Cui, Yang Yue, Zhijie Deng, Jun Zhu
NeurIPS 2022 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on various challenging synthetic and real-world noisy datasets verify that the proposed method can outperform competing baselines in the aspect of classification performance. |
| Researcher Affiliation | Collaboration | Peng Cui (1, 3), Yang Yue (1), Zhijie Deng (1, 2), Jun Zhu (1). (1) Dept. of Comp. Sci. & Tech., Institute for AI, BNRist Center, Tsinghua-Bosch Joint ML Center, THBI Lab, Tsinghua University, Beijing, 100084 China; (2) Qing Yuan Research Institute, Shanghai Jiao Tong University; (3) RealAI |
| Pseudocode | Yes | Algorithm 1: Training DNNs under (x,y)-noise |
| Open Source Code | Yes | 3.a) Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes] |
| Open Datasets | Yes | The proposed method is first evaluated on two benchmark datasets with synthetic noise: CIFAR-100 [24] and Tiny-ImageNet [24] (the subset of ImageNet [9])... Moreover, we validate the effectiveness of the proposed method under more challenging real-world noise on WebVision [28]. |
| Dataset Splits | Yes | Table 1 presents the results of all methods on CIFAR-100 and Tiny-ImageNet with different rates of x-noise and y-noise. ... Table 2 lists the experimental results. As we can see, the proposed method significantly outperforms other baselines not only on the WebVision validation set but also on the ILSVRC12 validation set [9]. |
| Hardware Specification | No | The paper mentions support from the 'High Performance Computing Center, Tsinghua University' in the Acknowledgement section, but this is a general reference and does not specify any particular GPU models, CPU types, or other detailed hardware specifications used for the experiments. |
| Software Dependencies | No | The paper states that 'SGD is used to optimize the network' and that 'The deep ensemble we used consists of 5 ResNet18', but it does not specify any software versions for libraries (e.g., TensorFlow, PyTorch, scikit-learn) or programming languages. |
| Experiment Setup | Yes | SGD is used to optimize the network with a batch size of 256. More details can be found in Appendix B. |