Locally Adaptive Label Smoothing Improves Predictive Churn

Authors: Dara Bahri, Heinrich Jiang

ICML 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "we present several baselines for reducing churn and show that training on soft labels obtained by adaptively smoothing each example's label based on the example's neighboring labels often outperforms the baselines on churn while improving accuracy on a variety of benchmark classification tasks and model architectures." "We now describe the experimental methodology and results for validating our proposed method."
Researcher Affiliation | Industry | Google Research, Mountain View, USA. Correspondence to: Dara Bahri <dbahri@google.com>.
Pseudocode | Yes | Algorithm 1: Deep k-NN locally adaptive label smoothing (a hedged sketch of the idea appears after this table).
Open Source Code | No | The paper does not provide an explicit statement about open-sourcing the code for the described methodology, nor a link to it.
Open Datasets | Yes | MNIST, Fashion-MNIST, SVHN, CelebA (Liu et al., 2018), and the UCI Phishing dataset (Dua & Graff, 2017).
Dataset Splits | No | "We use the standard train and test splits, which consist of 162770 and 19962 images respectively" (for CelebA); 7406 train and 3649 test examples (for Phishing). The paper gives explicit train and test counts or uses standard splits, but it does not describe a separate validation split.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used to run its experiments; it only mentions general settings such as the optimizer and minibatch size.
Software Dependencies | No | The paper mentions using the Adam optimizer and specific model architectures such as a LeNet-5 CNN, but it does not give version numbers for software libraries or dependencies (e.g., Python, TensorFlow, or PyTorch versions).
Experiment Setup | Yes | "For all datasets we use the Adam optimizer with default learning rate 0.001. We use a minibatch size of 128 throughout." The paper also gives epoch counts and architecture details per dataset, e.g., a "three-layer MLP with 256 hidden units and ReLU activations for 20 epochs" for MNIST (a sketch of this setup appears after the table).
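
To make the method summarized in the table more concrete, here is a minimal NumPy sketch of locally adaptive label smoothing as suggested by the abstract and the title of Algorithm 1: each example's one-hot label is mixed with the empirical label distribution of its k nearest neighbors in a deep embedding space, with a per-example smoothing strength. The embedding source, the choice of k and alpha, and the exact mixing rule below are assumptions made for illustration, not the paper's Algorithm 1 itself.

```python
import numpy as np

def knn_adaptive_soft_labels(embeddings, labels, num_classes, k=10, alpha=0.1):
    """Sketch: soft labels from mixing each one-hot label with the label
    distribution of its k nearest neighbors in embedding space.

    embeddings: (n, d) per-example features (e.g., penultimate-layer
                activations of an auxiliary model -- an assumption here).
    labels:     (n,) integer class labels.
    Returns:    (n, num_classes) soft-label matrix.
    """
    n = embeddings.shape[0]
    one_hot = np.eye(num_classes)[labels]                      # (n, C)

    # Pairwise squared Euclidean distances (fine for small n; use an ANN
    # library at scale).
    d2 = ((embeddings[:, None, :] - embeddings[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)                                # exclude self
    nn_idx = np.argsort(d2, axis=1)[:, :k]                      # (n, k)

    # Empirical label distribution over each example's neighborhood.
    neighbor_dist = one_hot[nn_idx].mean(axis=1)                # (n, C)

    # Locally adaptive smoothing strength: smooth more when the neighborhood
    # disagrees with the example's own label (assumed form of "adaptive").
    agreement = neighbor_dist[np.arange(n), labels]             # in [0, 1]
    strength = alpha * (1.0 - agreement)                        # (n,)

    soft = (1.0 - strength)[:, None] * one_hot + strength[:, None] * neighbor_dist
    return soft / soft.sum(axis=1, keepdims=True)
```

The resulting soft labels would replace the hard labels during training, which is how the method reduces churn while keeping accuracy, per the abstract.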
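
Below is a sketch of the quoted MNIST experiment setup: a three-layer MLP with 256 hidden units and ReLU activations, Adam with the default 0.001 learning rate, minibatch size 128, and 20 epochs. The use of Keras/TensorFlow is an assumption (the paper names no framework, per the Software Dependencies row), as is the reading of "three-layer" as three hidden layers.

```python
import tensorflow as tf

# Framework choice is an assumption; the paper does not name one.
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# Assumed reading: three hidden layers of 256 ReLU units each.
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dense(10),
])

# Adam with default learning rate 0.001, as stated in the paper.
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)

# Minibatch size 128, 20 epochs, as stated for MNIST.
model.fit(x_train, y_train, batch_size=128, epochs=20,
          validation_data=(x_test, y_test))
```

In the paper's method, the hard labels here would be replaced by the adaptively smoothed soft labels (with a categorical cross-entropy loss on soft targets); this sketch shows only the base training configuration quoted in the table.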