Decoupled Training for Long-Tailed Classification With Stochastic Representations
Authors: Giung Nam, Sunguk Jang, Juho Lee
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on CIFAR10/100-LT, ImageNet-LT, and iNaturalist-2018 benchmarks show that our proposed method improves upon previous methods both in terms of prediction accuracy and uncertainty estimation. |
| Researcher Affiliation | Collaboration | 1Korea Advanced Institute of Science and Technology (KAIST), 2AITRICS |
| Pseudocode | Yes | Algorithm 1 Decoupled training w/ SWA + SRepr (ours). |
| Open Source Code | Yes | Code is available at https://github.com/cs-giung/long-tailed-srepr. Our implementations are built on JAX (Bradbury et al., 2018), Flax (Heek et al., 2020), and Optax (Hessel et al., 2020). |
| Open Datasets | Yes | Using CIFAR10/100-LT (Cao et al., 2019), ImageNet-LT (Liu et al., 2019), and iNaturalist-2018 (Van Horn et al., 2018) benchmarks for long-tailed image classification, we empirically validate that our proposed method improves upon previous approaches both in terms of prediction accuracy and uncertainty estimation. |
| Dataset Splits | Yes | ImageNet-LT. It consists of 115,846 train examples, 20,000 validation examples and 50,000 test examples from 1,000 classes. |
| Hardware Specification | Yes | For ImageNet-LT and iNaturalist-2018, we conduct all experiments on 8 TPUv3 cores, supported by TPU Research Cloud. |
| Software Dependencies | No | Our implementations are built on JAX (Bradbury et al., 2018), Flax (Heek et al., 2020), and Optax (Hessel et al., 2020). |
| Experiment Setup | Yes | Throughout the main experiments on ImageNet-LT and iNaturalist-2018, we use an SGD optimizer with batch size 256, Nesterov momentum 0.9, and a single-cycle cosine decaying learning rate starting from the base learning rate of 0.1. Unless specified, the optimization for the representation learning stage terminates after 100 training epochs for ImageNet-LT and 200 training epochs for iNaturalist-2018. For the classifier re-training, we introduce an additional 10% training epochs to re-train the classifier. (...) Throughout the paper, we apply λwd = 0.0003 for ImageNet-LT, λwd = 0.0001 for iNaturalist-2018, and λwd = 0.0005 for CIFAR10/100-LT. (...) Throughout the paper, we use ηSWA = 0.010 for ImageNet-LT, ηSWA = 0.005 for iNaturalist-2018, and ηSWA = 0.1 for CIFAR10/100-LT. |
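
The Experiment Setup row quotes the optimizer configuration for ImageNet-LT. As a minimal sketch (not the authors' released code), the same configuration can be expressed with Optax, which the paper lists among its dependencies. The total step count and the use of decoupled weight decay via `optax.add_decayed_weights` are assumptions for illustration; the quoted values (batch size 256, Nesterov momentum 0.9, base learning rate 0.1, λwd = 0.0003, 100 epochs) come from the row above.

```python
import optax

# Values quoted in the Experiment Setup and Dataset Splits rows (ImageNet-LT).
num_train_examples = 115_846
batch_size = 256
num_epochs = 100                      # representation-learning stage
steps_per_epoch = num_train_examples // batch_size
total_steps = steps_per_epoch * num_epochs   # assumed definition of the decay horizon

# Single-cycle cosine decay starting from the base learning rate of 0.1.
lr_schedule = optax.cosine_decay_schedule(init_value=0.1, decay_steps=total_steps)

# SGD with Nesterov momentum 0.9 and weight decay 3e-4 for ImageNet-LT.
# Adding weight decay as a decoupled transform is an assumption; the paper may
# instead fold it into the loss.
optimizer = optax.chain(
    optax.add_decayed_weights(3e-4),
    optax.sgd(learning_rate=lr_schedule, momentum=0.9, nesterov=True),
)
```

For iNaturalist-2018 or CIFAR10/100-LT, the same sketch would only swap in the corresponding epoch counts and the quoted λwd values (0.0001 and 0.0005, respectively).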