Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Can Less be More? When Increasing-to-Balancing Label Noise Rates is Considered Beneficial
Authors: Yang Liu, Jialu Wang
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We formally establish the effectiveness of the proposed solution and demonstrate it with extensive experiments. In order to verify the power of our increasing-to-balancing method, we conduct extensive experiments on both unconstrained learning and constrained learning settings. |
| Researcher Affiliation | Academia | Yang Liu Computer Science and Engineering University of California, Santa Cruz Santa Cruz, CA 95064 EMAIL Jialu Wang Computer Science and Engineering University of California, Santa Cruz Santa Cruz, CA 95064 EMAIL |
| Pseudocode | Yes | We show pseudocode for an implementation of estimating PA in Figure ?? in Appendix ??. We summarize NOISE+ in Algorithm 1. Figure 3: Pseudocode for Flip. Flip takes the dataset and a small probability ϵ as input, and only flips positive examples with probability ϵ. |
| Open Source Code | Yes | The code for reproducing the experimental results is available at https://github.com/UCSC-REAL/CanLessBeMore. |
| Open Datasets | Yes | The datasets include: the UCI Adult Income dataset [9], the Compas recidivism dataset [2], Fairface [15] face attribute dataset, and CIFAR 10 [16] dataset. |
| Dataset Splits | No | The paper mentions training and testing sets, but it does not provide specific details on validation splits, such as percentages, sample counts, or references to predefined validation sets. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models, memory specifications, or types of computing instances used for running the experiments. It only mentions implementing models and training them. |
| Software Dependencies | No | The paper mentions various models and losses (e.g., one-layer perceptron, cross entropy, peer loss, MLP, ResNet-50, vision transformer) but does not specify any software libraries or their version numbers (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | No | The paper mentions types of models used (e.g., one-layer perceptron, MLP, ResNet-50) and that experiments were repeated 5 runs with different random seeds. However, it lacks specific details on hyperparameters such as learning rate, batch size, number of epochs, or optimizer settings, which are crucial for full reproducibility. It states 'Without a careful tuning of training parameters,' implying these details are not provided. |
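The Flip procedure quoted in the Pseudocode row ("Flip takes the dataset and a small probability ϵ as input, and only flips positive examples with probability ϵ") can be sketched as follows. This is an illustrative reconstruction under stated assumptions (binary labels in {0, 1}, dataset as a list of (features, label) pairs), not the authors' implementation; the function name `flip` and its signature are hypothetical.

```python
import random

def flip(dataset, eps, seed=0):
    """Sketch of the Flip procedure: with probability eps, flip each
    positive example's label to negative; negatives are left untouched.

    Assumes `dataset` is a list of (features, label) pairs with labels
    in {0, 1}. Illustrative only, not the paper's implementation.
    """
    rng = random.Random(seed)  # fixed seed for reproducible flipping
    flipped = []
    for x, y in dataset:
        if y == 1 and rng.random() < eps:
            y = 0  # flip a positive example with probability eps
        flipped.append((x, y))
    return flipped
```

The design choice matches the paper's framing: only positive labels are perturbed, so the procedure injects noise asymmetrically to move the two class-conditional noise rates toward balance.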