Noisy Feature Mixup

Authors: Soon Hoe Lim, N. Benjamin Erichson, Francisco Utrera, Winnie Xu, Michael W. Mahoney

ICLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We provide empirical results in support of our theoretical findings, showing that NFM improves robustness with respect to various forms of data perturbation across a wide range of state-of-the-art architectures on computer vision benchmark tasks. In the Supplementary Materials (SM), we provide proofs for our theorems along with additional theoretical and empirical results to gain more insights into NFM.
Researcher Affiliation | Academia | Soon Hoe Lim (Nordita, KTH and Stockholm University, soon.hoe.lim@su.edu); N. Benjamin Erichson* (University of Pittsburgh, erichson@pitt.edu); Francisco Utrera (University of Pittsburgh and ICSI, utrerf@berkeley.edu); Winnie Xu (University of Toronto, winniexu@cs.toronto.edu); Michael W. Mahoney (ICSI and UC Berkeley, mmahoney@stat.berkeley.edu)
Pseudocode | No | The paper describes the training steps for NFM in a numbered list (1-6) within Section 3, but these are presented as descriptive text rather than a formally structured pseudocode or algorithm block. (A hedged sketch of the described step is given after this table.)
Open Source Code | Yes | The codes that can be used to reproduce the empirical results, as well as description of the data processing steps, presented in this paper are available as a zip file in Supplementary Material at OpenReview.net. The codes are also available at https://github.com/erichson/NFM.
Open Datasets | Yes | We evaluate the average performance of NFM with different model architectures on CIFAR10 (Krizhevsky, 2009), CIFAR-100 (Krizhevsky, 2009), ImageNet (Deng et al., 2009), and CIFAR10c (Hendrycks & Dietterich, 2019).
Dataset Splits | No | The paper does not explicitly provide percentages or counts for training/validation/test splits, nor does it refer to a standard validation split. It mentions training and test sets, but not a distinct validation split.
Hardware Specification | No | The paper does not explicitly describe the hardware used for running its experiments, such as specific GPU or CPU models, or cloud computing instance types.
Software Dependencies | No | The paper mentions optimizers such as Adam and implicitly relies on machine learning frameworks (e.g., PyTorch, given the deep learning context), but it does not provide specific version numbers for any software dependencies.
Experiment Setup | Yes | All hyperparameters are consistent with those of the baseline model across the ablation experiments. In the models trained on the different data augmentation schemes, we keep α fixed, i.e., the parameter defining Beta(α, α), from which the λ parameter controlling the convex combination between data point pairs is sampled. Across all models trained with NFM, we control the level of noise injections by fixing the additive noise level to σ_add = 0.4 and the multiplicative noise to σ_mult = 0.2. To demonstrate the significant improvements in robustness upon the introduction of these small input perturbations, we show a second model (*) that was injected with higher noise levels (i.e., σ_add = 1.0, σ_mult = 0.5). See SM (Section F.5) for further details and comparisons against NFM models trained on various other levels of noise injections. (A hedged configuration sketch based on these values follows the table.)
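
The training step referenced in the Pseudocode row (mix feature pairs with a Beta-sampled coefficient, then inject additive and multiplicative noise) can be illustrated with a short sketch. The snippet below is a minimal, hedged reconstruction in PyTorch, not the authors' reference implementation (that code is available at https://github.com/erichson/NFM); the function name nfm_mix, the Gaussian noise, and the default argument values are illustrative assumptions.

```python
import torch

def nfm_mix(h, y, alpha=1.0, sigma_add=0.4, sigma_mult=0.2):
    """One noisy-feature-mixup step on a batch of (possibly hidden) features.

    h : (batch, ...) tensor of features at the layer where mixing is applied
    y : (batch, num_classes) tensor of one-hot or soft labels
    Returns the mixed-and-noised features and the correspondingly mixed labels.
    """
    # Sample the mixing coefficient lambda ~ Beta(alpha, alpha).
    lam = torch.distributions.Beta(alpha, alpha).sample().to(h.device)

    # Pair each example with a randomly permuted partner and take the
    # convex combination of both features and labels (mixup).
    idx = torch.randperm(h.size(0), device=h.device)
    h_mix = lam * h + (1.0 - lam) * h[idx]
    y_mix = lam * y + (1.0 - lam) * y[idx]

    # Inject multiplicative and additive noise into the mixed features
    # (Gaussian noise is an assumption here; the paper allows other noise
    # distributions).
    noise_mult = 1.0 + sigma_mult * torch.randn_like(h_mix)
    noise_add = sigma_add * torch.randn_like(h_mix)
    h_noisy = noise_mult * h_mix + noise_add

    return h_noisy, y_mix
```

In the procedure described in the paper's Section 3, the layer at which this mixing occurs is drawn at random from a set of eligible layers at each step; the remaining layers are then run on the noised mixture, and the loss is computed against the mixed labels.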
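To make the noise levels quoted in the Experiment Setup row concrete, here is a hedged usage sketch that plugs them into nfm_mix from the block above. Applying the mix to the raw inputs (rather than a random hidden layer), the soft-target cross-entropy, and the names model, optimizer, and alpha=1.0 are illustrative assumptions rather than details taken from the paper.

```python
import torch.nn.functional as F

# Noise settings quoted in the Experiment Setup row.
SIGMA_ADD, SIGMA_MULT = 0.4, 0.2      # default setting used across NFM models
# SIGMA_ADD, SIGMA_MULT = 1.0, 0.5    # higher-noise "(*)" variant

def nfm_training_step(model, x, y_onehot, optimizer, alpha=1.0):
    """One optimization step with NFM applied to the raw inputs (a simplification)."""
    x_noisy, y_mix = nfm_mix(x, y_onehot, alpha=alpha,
                             sigma_add=SIGMA_ADD, sigma_mult=SIGMA_MULT)
    logits = model(x_noisy)
    # Cross-entropy against the mixed (soft) targets.
    loss = -(y_mix * F.log_softmax(logits, dim=1)).sum(dim=1).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```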