Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

From Noisy Prediction to True Label: Noisy Prediction Calibration via Generative Model

Authors: Heesun Bae, Seungjae Shin, Byeonghu Na, Joonho Jang, Kyungwoo Song, Il-Chul Moon

ICML 2022 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Table 1 shows the experimental results on three synthetic datasets with four types of noise at various noise ratios.
Researcher Affiliation | Collaboration | (1) Industrial and Systems Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, Republic of Korea; (2) Department of AI, University of Seoul, Seoul, Republic of Korea; (3) Summary.AI, Daejeon, Republic of Korea
Pseudocode | Yes | Algorithm 1: Instance-Dependent Noise Generation Process
Open Source Code | Yes | The implemented code is available at https://github.com/BaeHeeSun/NPC.
Open Datasets | Yes | MNIST (LeCun, 1998), Fashion-MNIST (Xiao et al., 2017), and CIFAR-10 (Krizhevsky et al., 2009).
Dataset Splits | Yes | MNIST (LeCun, 1998) and Fashion-MNIST (Xiao et al., 2017) are both 28 × 28 grayscale image datasets with 10 classes, each comprising 60,000 training samples and 10,000 test samples. ... A part of the noisy training dataset is split off as a validation set, and the network is trained only until the validation performance stops improving.
Hardware Specification | No | No specific hardware details (e.g., GPU models, CPU types, or memory specifications) used for running the experiments are mentioned in the paper.
Software Dependencies | No | The paper states, 'All methods are implemented by PyTorch.' However, it does not specify version numbers for PyTorch or any other software dependencies used in the experiments.
Experiment Setup | Yes | We train all synthetic datasets with batch size 128 and all real-world datasets with batch size 32. For all datasets, we use the Adam optimizer with a learning rate of 10^-3, and no learning rate decay is applied.
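The Experiment Setup row can be sketched in PyTorch. The placeholder linear model below is an assumption for illustration only (the paper's actual architectures differ); the optimizer choice, learning rate, and batch sizes follow the quoted description.

```python
import torch
import torch.nn as nn

# Placeholder model standing in for the paper's networks (e.g. for 28x28 MNIST).
model = nn.Linear(28 * 28, 10)

# Adam with learning rate 1e-3; no learning-rate decay schedule is attached,
# matching the quoted setup.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Batch sizes as reported: 128 for synthetic datasets, 32 for real-world ones.
batch_size_synthetic = 128
batch_size_real_world = 32
```

These values would then be passed to the corresponding `DataLoader` and training loop.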
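The stopping rule described in the Dataset Splits row (train only until performance on the held-out noisy validation split degrades) can be illustrated with a small framework-free sketch. The function name `early_stop_epoch`, the `patience` value, and the accuracy sequence are hypothetical, for illustration only, not the authors' implementation.

```python
def early_stop_epoch(val_accuracies, patience=1):
    """Return the epoch index at which training would stop: the first epoch
    whose validation accuracy has failed to improve for `patience`
    consecutive epochs."""
    best = float("-inf")
    epochs_without_improvement = 0
    for epoch, acc in enumerate(val_accuracies):
        if acc > best:
            best = acc
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return len(val_accuracies) - 1  # never triggered; ran all epochs

# Validation accuracy rises, then drops: training stops at the first drop.
stop = early_stop_epoch([0.52, 0.61, 0.70, 0.66])
```

In practice the validation split comes from the noisy training data itself, so this criterion stops training before the network starts memorizing label noise.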