Simplifying Neural Network Training Under Class Imbalance
Authors: Ravid Shwartz-Ziv, Micah Goldblum, Yucen Li, C. Bayan Bruss, Andrew G. Wilson
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Conducting evaluations on real-world datasets, we find that existing methods, which performed well on web-scraped natural image benchmarks on which they were designed, underperform in the real-world setting, whereas our approach is robust. |
| Researcher Affiliation | Collaboration | Ravid Shwartz-Ziv, New York University (ravid.shwartz.ziv@nyu.edu); Micah Goldblum, New York University (goldblum@nyu.edu); Yucen Lily Li, New York University (yucenli@nyu.edu); C. Bayan Bruss, Capital One (bayan.bruss@capitalone.com); Andrew Gordon Wilson, New York University (andrewgw@cims.nyu.edu) |
| Pseudocode | No | The paper describes mathematical formulas for optimization methods such as SAM and label smoothing, but does not include structured pseudocode or algorithm blocks (a generic sketch of both methods appears after this table). |
| Open Source Code | No | The paper does not contain an explicit statement about releasing code or a direct link to a code repository for the methodology described. |
| Open Datasets | Yes | To conduct our investigation, we leverage three benchmark image datasets: CIFAR-10 [46], CIFAR-100, and CINIC-10 [15], along with three tabular datasets: Otto Group Product Classification [37], Covertype [7], and Adult datasets [43]. |
| Dataset Splits | No | The paper mentions tuning hyperparameters 'on a validation split' in Appendix A.9.1 ('Accordingly, we tune each model's hyperparameters for each dataset on a validation split.'), but it does not provide specific details on the size, percentage, or methodology of these validation splits for reproduction. |
| Hardware Specification | Yes | All of our models were trained on V100 GPUs. |
| Software Dependencies | No | The paper mentions PyTorch, PyTorch Lightning, and the Optuna library but does not specify their version numbers (e.g., 'Our implementation was done in PyTorch, utilizing the PyTorch Lightning library for training.'). |
| Experiment Setup | Yes | We employ the SGD optimizer with momentum 0.9 and weight decay coefficient 2×10⁻⁴. Our models are trained for 300 epochs with cosine annealing and a linear warm-up of the learning rate. The learning rate is initialized at 0.1. (A configuration sketch appears after this table.) |
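
The Pseudocode row notes that the paper gives formulas for SAM and label smoothing without pseudocode. The snippet below is a generic, minimal PyTorch sketch of the two techniques in their standard textbook form (label smoothing via `CrossEntropyLoss(label_smoothing=...)`, and a two-pass SAM step); the model, `rho=0.05`, and `label_smoothing=0.1` are illustrative assumptions, not values taken from the paper.

```python
import torch
import torch.nn as nn

model = nn.Linear(20, 10)                              # placeholder classifier (assumed)
criterion = nn.CrossEntropyLoss(label_smoothing=0.1)   # label smoothing, coefficient assumed
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
rho = 0.05                                             # SAM neighborhood radius (assumed)

def sam_step(inputs, targets):
    """One sharpness-aware minimization step in its standard two-pass form."""
    # First pass: gradient at the current weights.
    criterion(model(inputs), targets).backward()

    # Ascend to the (approximate) worst-case weights within an L2 ball of radius rho.
    with torch.no_grad():
        grad_norm = torch.norm(torch.stack(
            [p.grad.norm(p=2) for p in model.parameters() if p.grad is not None]))
        perturbations = []
        for p in model.parameters():
            if p.grad is None:
                perturbations.append(None)
                continue
            e = rho * p.grad / (grad_norm + 1e-12)
            p.add_(e)
            perturbations.append(e)

    # Second pass: gradient at the perturbed weights.
    optimizer.zero_grad()
    criterion(model(inputs), targets).backward()

    # Undo the perturbation and update the original weights with the second-pass gradient.
    with torch.no_grad():
        for p, e in zip(model.parameters(), perturbations):
            if e is not None:
                p.sub_(e)
    optimizer.step()
    optimizer.zero_grad()

# Toy usage with random data.
x, y = torch.randn(8, 20), torch.randint(0, 10, (8,))
sam_step(x, y)
```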
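
The Experiment Setup row quotes a concrete optimization recipe. Below is a minimal PyTorch sketch of that configuration (SGD, momentum 0.9, weight decay 2e-4, initial learning rate 0.1, 300 epochs, linear warm-up followed by cosine annealing); the warm-up length of 5 epochs and the placeholder model are assumptions made for illustration, since the table does not specify them.

```python
import torch
import torch.nn as nn

epochs, warmup_epochs = 300, 5                  # warm-up length is an assumption
model = nn.Linear(20, 10)                       # placeholder classifier (assumed)

# SGD with momentum 0.9, weight decay 2e-4, initial learning rate 0.1 (as quoted).
optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                            momentum=0.9, weight_decay=2e-4)

# Linear warm-up of the learning rate, then cosine annealing for the remaining epochs.
warmup = torch.optim.lr_scheduler.LinearLR(optimizer, start_factor=0.01,
                                           total_iters=warmup_epochs)
cosine = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer,
                                                    T_max=epochs - warmup_epochs)
scheduler = torch.optim.lr_scheduler.SequentialLR(optimizer,
                                                  schedulers=[warmup, cosine],
                                                  milestones=[warmup_epochs])

for epoch in range(epochs):
    # ... one training pass over the (imbalanced) dataset would go here ...
    scheduler.step()                            # advance the schedule once per epoch
```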