Escaping Saddle Points for Effective Generalization on Class-Imbalanced Data
Authors: Harsh Rangwani, Sumukh K Aithal, Mayank Mishra, Venkatesh Babu R
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Using SAM results in a 6.2% increase in accuracy on the minority classes over the state-of-the-art Vector Scaling Loss, leading to an overall average increase of 4% across imbalanced datasets. The code is available at https://github.com/val-iisc/Saddle-LongTail. |
| Researcher Affiliation | Academia | Harsh Rangwani, Sumukh K Aithal, Mayank Mishra, R. Venkatesh Babu — Video Analytics Lab, Indian Institute of Science, Bengaluru, India {harshr@iisc.ac.in, sumukhaithal6@gmail.com, mayankmishra@iisc.ac.in, venky@iisc.ac.in} |
| Pseudocode | Yes | Algorithm for DRW+SAM is defined in App. G. |
| Open Source Code | Yes | The code is available at https://github.com/val-iisc/Saddle-LongTail. |
| Open Datasets | Yes | We report our results on four long-tailed datasets: CIFAR-10 LT [9], CIFAR-100 LT [9], ImageNet-LT [34], and iNaturalist 2018 [44]. a) CIFAR-10 LT and CIFAR-100 LT: The original CIFAR-10 and CIFAR-100 datasets consist of 50,000 training images and 10,000 validation images, spread across 10 and 100 classes, respectively. |
| Dataset Splits | Yes | The original CIFAR-10 and CIFAR-100 datasets consist of 50,000 training images and 10,000 validation images, spread across 10 and 100 classes, respectively. |
| Hardware Specification | No | The main body of the paper does not explicitly state specific hardware details such as GPU models, CPU types, or memory amounts. It states 'Added in the Appendix' regarding compute resources, but the appendix content is not provided in the given text. |
| Software Dependencies | No | The paper mentions software components like the 'ResNet-32 architecture' and 'SGD', but does not provide specific version numbers for any libraries, frameworks, or programming languages used (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | We follow the hyperparameters and setup as in Cao et al. [9] for the CIFAR-10 LT and CIFAR-100 LT datasets. We train a ResNet-32 architecture as the backbone and use SGD with a momentum of 0.9 as the base optimizer for 200 epochs. A multi-step learning rate schedule is used, which drops the learning rate by factors of 0.01 and 0.0001 at the 160th and 180th epochs, respectively. For training with SAM, we set a constant ρ value of either 0.5 or 0.8 for most methods. |
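
For readers who want to reproduce the quoted recipe, the following is a minimal sketch (not the authors' released code) of SGD with a multi-step learning rate schedule combined with a bare-bones SAM update. The momentum, epoch count, LR milestones, and ρ follow the Experiment Setup row; the initial learning rate of 0.1, the weight decay of 2e-4, and the `build_resnet32` / `train_loader` placeholders are assumptions in the style of the Cao et al. CIFAR-LT setup, not values quoted above.

```python
# Hedged sketch of the quoted training recipe: SGD (momentum 0.9), 200 epochs,
# LR dropped to 0.01x / 0.0001x of its initial value at epochs 160 / 180, and a
# minimal SAM step with a constant radius rho. Model and data loading are placeholders.
import torch
import torch.nn.functional as F

def sam_step(model, base_optimizer, images, labels, rho=0.5):
    """One SAM update: perturb weights toward the worst case within an L2 ball
    of radius rho, recompute gradients there, then step from the original weights."""
    # First pass: gradient at the current weights.
    loss = F.cross_entropy(model(images), labels)
    loss.backward()

    # Perturb each parameter by rho * g / ||g|| (small constant avoids divide-by-zero).
    grads = [p.grad for p in model.parameters() if p.grad is not None]
    grad_norm = torch.norm(torch.stack([g.norm(p=2) for g in grads]), p=2)
    perturbations = []
    with torch.no_grad():
        for p in model.parameters():
            if p.grad is None:
                perturbations.append(None)
                continue
            e = rho * p.grad / (grad_norm + 1e-12)
            p.add_(e)
            perturbations.append(e)
    model.zero_grad()

    # Second pass: gradient at the perturbed weights.
    F.cross_entropy(model(images), labels).backward()

    # Undo the perturbation, then apply the base optimizer's step.
    with torch.no_grad():
        for p, e in zip(model.parameters(), perturbations):
            if e is not None:
                p.sub_(e)
    base_optimizer.step()
    base_optimizer.zero_grad()
    return loss.item()

# Placeholders: build_resnet32 and train_loader stand in for the user's own backbone
# and CIFAR-10 LT / CIFAR-100 LT data pipeline.
model = build_resnet32(num_classes=10)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=2e-4)
# gamma=0.01 applied at both milestones yields 0.01x the initial LR at epoch 160
# and 0.0001x at epoch 180, matching the quoted schedule.
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[160, 180], gamma=0.01)

for epoch in range(200):
    for images, labels in train_loader:
        sam_step(model, optimizer, images, labels, rho=0.5)
    scheduler.step()
```

Note that this sketch uses plain cross-entropy for brevity; the paper combines SAM with re-weighting losses such as DRW or Vector Scaling Loss (see the DRW+SAM algorithm referenced in App. G), so the loss function would need to be swapped accordingly.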