Simplifying Neural Network Training Under Class Imbalance

Authors: Ravid Shwartz-Ziv, Micah Goldblum, Yucen Li, C. Bayan Bruss, Andrew G. Wilson

NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Conducting evaluations on real-world datasets, we find that existing methods, which performed well on web-scraped natural image benchmarks on which they were designed, underperform in the real-world setting, whereas our approach is robust.
Researcher Affiliation | Collaboration | Ravid Shwartz-Ziv (New York University, ravid.shwartz.ziv@nyu.edu); Micah Goldblum (New York University, goldblum@nyu.edu); Yucen Lily Li (New York University, yucenli@nyu.edu); C. Bayan Bruss (Capital One, bayan.bruss@capitalone.com); Andrew Gordon Wilson (New York University, andrewgw@cims.nyu.edu)
Pseudocode | No | The paper describes mathematical formulas for optimization methods such as SAM and label smoothing, but does not include structured pseudocode or algorithm blocks. (An illustrative sketch of both methods follows this table.)
Open Source Code | No | The paper does not contain an explicit statement about releasing code or a direct link to a code repository for the methodology described.
Open Datasets | Yes | To conduct our investigation, we leverage three benchmark image datasets: CIFAR-10 [46], CIFAR-100, and CINIC-10 [15], along with three tabular datasets: Otto Group Product Classification [37], Covertype [7], and Adult [43]. (A sketch of constructing a class-imbalanced subset from one of these benchmarks follows this table.)
Dataset Splits | No | The paper mentions tuning hyperparameters 'on a validation split' in Appendix A.9.1 ('Accordingly, we tune each model's hyperparameters for each dataset on a validation split.'), but it does not provide specific details on the size, percentage, or methodology of these validation splits for reproduction. (A sketch of validation-split tuning follows this table.)
Hardware Specification | Yes | All of our models were trained on V100 GPUs.
Software Dependencies | No | The paper mentions 'PyTorch', 'PyTorch Lightning', and the 'Optuna library' but does not specify their version numbers (e.g., 'Our implementation was done in PyTorch, utilizing the PyTorch Lightning library for training.').
Experiment Setup | Yes | We employ the SGD optimizer with momentum 0.9 and weight decay coefficient 2×10⁻⁴. Our models are trained for 300 epochs with cosine annealing and a linear warm-up of the learning rate. The learning rate is initialized at 0.1. (A sketch of this configuration follows this table.)
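
Although the paper provides no pseudocode (Pseudocode row above), the two methods it names are easy to sketch. Below is a minimal PyTorch sketch of label smoothing and a single SAM update; the helper name `sam_update`, the value rho=0.05, and the smoothing value 0.1 are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

# Label smoothing: recent PyTorch versions support it directly in the loss.
# The smoothing value 0.1 is an assumption, not taken from the paper.
smoothed_ce = nn.CrossEntropyLoss(label_smoothing=0.1)

def sam_update(model, loss_fn, x, y, base_optimizer, rho=0.05):
    """One SAM step: perturb the weights toward the gradient direction,
    then descend using the gradient evaluated at the perturbed point."""
    # First pass: gradients at the current weights.
    base_optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()

    # Climb to the local worst case: w + rho * g / ||g||.
    with torch.no_grad():
        grads = [p.grad for p in model.parameters() if p.grad is not None]
        grad_norm = torch.norm(torch.stack([g.norm(2) for g in grads]))
        perturbations = []
        for p in model.parameters():
            if p.grad is None:
                perturbations.append(None)
                continue
            e = rho * p.grad / (grad_norm + 1e-12)
            p.add_(e)
            perturbations.append(e)

    # Second pass: gradients at the perturbed weights.
    base_optimizer.zero_grad()
    loss_fn(model(x), y).backward()

    # Undo the perturbation, then step with the second-pass gradients.
    with torch.no_grad():
        for p, e in zip(model.parameters(), perturbations):
            if e is not None:
                p.sub_(e)
    base_optimizer.step()
    return loss.item()
```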
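
The image benchmarks listed in the Open Datasets row are balanced as distributed, so class-imbalanced versions are typically built by subsampling. The sketch below constructs an exponentially long-tailed CIFAR-10 subset; the decay protocol and `imbalance_ratio=100` are assumptions about a common long-tailed setup, not necessarily the exact protocol used in the paper.

```python
import numpy as np
import torch
from torchvision import datasets, transforms

def long_tailed_indices(targets, num_classes=10, imbalance_ratio=100):
    """Keep exponentially fewer examples per class, from the full class size
    down to (full size / imbalance_ratio) for the rarest class."""
    targets = np.asarray(targets)
    n_max = np.bincount(targets).max()
    keep = []
    for c in range(num_classes):
        n_c = int(n_max * (1.0 / imbalance_ratio) ** (c / (num_classes - 1)))
        cls_idx = np.where(targets == c)[0]
        keep.extend(int(i) for i in cls_idx[:n_c])
    return keep

train = datasets.CIFAR10(root="data", train=True, download=True,
                         transform=transforms.ToTensor())
imbalanced_train = torch.utils.data.Subset(train, long_tailed_indices(train.targets))
```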
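
The Dataset Splits and Software Dependencies rows note that hyperparameters were tuned on a validation split with Optuna, without further detail. The sketch below shows what such tuning looks like; the synthetic data, linear model, and learning-rate search range are placeholders, and only the use of Optuna with a held-out validation split comes from the paper's text.

```python
import optuna
import torch
import torch.nn as nn

# Synthetic stand-in data; in the paper's setting this would be one of the
# benchmark datasets divided into train and validation portions.
X = torch.randn(1000, 20)
y = (X[:, 0] > 0).long()
X_train, y_train, X_val, y_val = X[:800], y[:800], X[800:], y[800:]

def objective(trial):
    # Search the learning rate on a log scale (the range is an assumption).
    lr = trial.suggest_float("lr", 1e-3, 0.3, log=True)
    model = nn.Linear(20, 2)
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(50):
        opt.zero_grad()
        loss_fn(model(X_train), y_train).backward()
        opt.step()
    # Score on the held-out validation split.
    with torch.no_grad():
        return (model(X_val).argmax(dim=1) == y_val).float().mean().item()

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=20)
print(study.best_params)
```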
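
The Experiment Setup row translates directly into optimizer and scheduler configuration. A minimal sketch of that configuration is below; the warm-up length and the placeholder model are assumptions, since the quoted text does not specify them.

```python
import torch
from torch.optim.lr_scheduler import LinearLR, CosineAnnealingLR, SequentialLR

model = torch.nn.Linear(3 * 32 * 32, 10)  # placeholder model, not the paper's architecture

# SGD with momentum 0.9, weight decay 2e-4, and initial learning rate 0.1, as stated in the paper.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=2e-4)

total_epochs = 300
warmup_epochs = 5  # assumption: the paper's quoted text does not give the warm-up length

# Linear warm-up followed by cosine annealing over the remaining epochs.
scheduler = SequentialLR(
    optimizer,
    schedulers=[
        LinearLR(optimizer, start_factor=0.01, total_iters=warmup_epochs),
        CosineAnnealingLR(optimizer, T_max=total_epochs - warmup_epochs),
    ],
    milestones=[warmup_epochs],
)

# Typical per-epoch usage: run one epoch of training, then advance the schedule.
for epoch in range(total_epochs):
    # ... one epoch of training would go here ...
    scheduler.step()
```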