Open-Sampling: Exploring Out-of-Distribution Data for Re-balancing Long-tailed Datasets

Authors: Hongxin Wei, Lue Tao, Renchunzi Xie, Lei Feng, Bo An

Venue: ICML 2022

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Extensive experiments demonstrate that our proposed method significantly outperforms existing data rebalancing methods and can boost the performance of existing state-of-the-art methods." |
| Researcher Affiliation | Academia | ¹Nanyang Technological University, Singapore; ²Nanjing University of Aeronautics and Astronautics, Nanjing, Jiangsu, China; ³Chongqing University, Chongqing, China |
| Pseudocode | Yes | "Algorithm 1 Open-sampling" (a hedged sketch of the training step appears below the table) |
| Open Source Code | Yes | "Code and data are publicly available at https://github.com/hongxin001/logitnorm_ood." |
| Open Datasets | Yes | "long-tailed CIFAR10/100 (Krizhevsky et al., 2009), CelebA-5 (Liu et al., 2015b; Kim et al., 2020), and Places-LT (Zhou et al., 2017). ... 300K Random Images (Hendrycks et al., 2019) ... The dataset is published on https://github.com/hendrycks/outlier-exposure." |
| Dataset Splits | Yes | "The original version of CIFAR-10 and CIFAR-100 contains 50,000 training images and 10,000 validation images of size 32x32 ... Besides, the validation set and the test set are kept unchanged." (a long-tailed subsampling sketch appears below) |
| Hardware Specification | Yes | "We conduct all the experiments on NVIDIA GeForce RTX 3090, and implement all methods with default parameters by PyTorch (Paszke et al., 2019)." |
| Software Dependencies | No | The paper mentions PyTorch (Paszke et al., 2019) but does not specify a version number for the software. |
| Experiment Setup | Yes | "For experiments on Long-Tailed CIFAR-10/100 (Krizhevsky et al., 2009) and CelebA-5 (Liu et al., 2015a), we perform training with ResNet-32 (He et al., 2016) for 200 epochs, using SGD with a momentum of 0.9, and a weight decay of 0.0002. We set the initial learning rate as 0.1, then decay by 0.01 at the 160th epoch and again at the 180th epoch." (see the optimizer sketch below) |
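The Pseudocode row refers to the paper's Algorithm 1 (Open-sampling): at each iteration, auxiliary out-of-distribution images (e.g., the 300K Random Images) are drawn and given labels sampled from a distribution complementary to the long-tailed class priors, so tail classes receive more of the open-set instances. The sketch below illustrates that idea only; the complementary form q_k = (1 - p_k)/(K - 1) and the weight `lam` are assumptions for illustration, not the authors' exact formulation — consult the released code for that.

```python
import torch
import torch.nn.functional as F

def complementary_distribution(class_counts):
    # p_k: empirical class priors of the long-tailed training set.
    p = class_counts.float() / class_counts.sum()
    # Assumed complement: rare classes get proportionally more OOD labels.
    # Normalizes because sum_k (1 - p_k) = K - 1.
    return (1.0 - p) / (len(p) - 1)

def open_sampling_step(model, x_in, y_in, x_ood, q, lam=1.0):
    """One training step: cross-entropy on the in-distribution batch plus a
    weighted cross-entropy on OOD images whose labels are drawn from q."""
    y_ood = torch.multinomial(q, num_samples=x_ood.size(0),
                              replacement=True).to(x_ood.device)
    loss_in = F.cross_entropy(model(x_in), y_in)
    loss_ood = F.cross_entropy(model(x_ood), y_ood)
    return loss_in + lam * loss_ood
```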
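The Dataset Splits row notes that only the training set is made long-tailed while validation and test sets stay unchanged. A minimal sketch of the standard exponential-decay subsampling used to build long-tailed CIFAR benchmarks follows; the imbalance ratio and helper name are illustrative assumptions, and the paper's construction may differ in detail.

```python
import numpy as np
import torchvision

def longtail_indices(targets, num_classes=10, imb_ratio=100):
    """Keep n_max * (1/imb_ratio)^(k/(K-1)) images of class k, so class
    sizes decay exponentially from head to tail."""
    targets = np.asarray(targets)
    n_max = int(np.sum(targets == 0))  # CIFAR classes are balanced: 5,000 each
    keep = []
    for k in range(num_classes):
        n_k = int(n_max * (1.0 / imb_ratio) ** (k / (num_classes - 1)))
        cls_idx = np.where(targets == k)[0]
        keep.extend(cls_idx[:n_k])
    return np.array(keep)

train = torchvision.datasets.CIFAR10(root="./data", train=True, download=True)
idx = longtail_indices(train.targets, num_classes=10, imb_ratio=100)
# The validation/test sets are left untouched, matching the paper's protocol.
```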
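The Experiment Setup row pins down the optimization hyperparameters precisely. Below is a minimal PyTorch sketch matching that description, reading "decay by 0.01" as a multiplicative factor at each milestone; the linear model is a stand-in, since the paper trains a ResNet-32.

```python
import torch

# Stand-in model; the paper uses ResNet-32 (He et al., 2016) on 32x32 inputs.
model = torch.nn.Linear(3 * 32 * 32, 10)

# SGD with momentum 0.9 and weight decay 0.0002, initial learning rate 0.1.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                            momentum=0.9, weight_decay=2e-4)

# Multiply the learning rate by 0.01 at epoch 160 and again at epoch 180.
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[160, 180], gamma=0.01)

for epoch in range(200):
    pass  # ... one epoch of training over the long-tailed set ...
    scheduler.step()
```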