Decoupling Representation and Classifier for Long-Tailed Recognition
Authors: Bingyi Kang, Saining Xie, Marcus Rohrbach, Zhicheng Yan, Albert Gordo, Jiashi Feng, Yannis Kalantidis
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct extensive experiments and set new state-of-the-art performance on common long-tailed benchmarks like ImageNet-LT, Places-LT and iNaturalist, showing that it is possible to outperform carefully designed losses, sampling strategies, even complex modules with memory, by using a straightforward approach that decouples representation and classification. |
| Researcher Affiliation | Collaboration | 1 Facebook AI, 2 National University of Singapore; kang@u.nus.edu, {s9xie,mrf,zyan3,agordo,yannisk}@fb.com, elefjia@nus.edu.sg |
| Pseudocode | No | No structured pseudocode or algorithm blocks were found. |
| Open Source Code | Yes | Our code is available at https://github.com/facebookresearch/classifier-balancing. |
| Open Datasets | Yes | We perform extensive experiments on three large-scale long-tailed datasets, including Places-LT (Liu et al., 2019), ImageNet-LT (Liu et al., 2019), and iNaturalist 2018 (iNaturalist, 2018). |
| Dataset Splits | Yes | After training on the long-tailed datasets, we evaluate the models on the corresponding balanced test/validation datasets and report the commonly used top-1 accuracy over all classes, denoted as All. |
| Hardware Specification | No | No specific hardware details (like GPU/CPU models, processor types, or memory amounts) were provided. |
| Software Dependencies | No | We use the PyTorch (Paszke et al., 2017) framework for all experiments. No specific version number for PyTorch or other software dependencies was provided. |
| Experiment Setup | Yes | For all experiments, if not specified, we use the SGD optimizer with momentum 0.9, batch size 512, a cosine learning rate schedule (Loshchilov & Hutter, 2016) gradually decaying from 0.2 to 0, and image resolution 224×224. In the first representation learning stage, the backbone network is usually trained for 90 epochs. In the second stage, i.e., for retraining a classifier (cRT), we restart the learning rate and train it for 10 epochs while keeping the backbone network fixed. |
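
Below is a minimal PyTorch sketch of the two-stage schedule described in the Experiment Setup row, not the authors' released code. The ResNet-50 backbone, the absence of weight decay, the toy placeholder batches, and the class-balanced sampling in stage two are assumptions made for illustration; the paper itself states SGD with momentum 0.9, batch size 512, a cosine learning rate schedule from 0.2 to 0, 90 epochs of representation learning, and 10 epochs of classifier retraining (cRT) with the backbone frozen.

```python
import torch
import torch.nn as nn
import torchvision

def make_optimizer(params, lr=0.2):
    # SGD with momentum 0.9 as stated in the paper; no weight decay here (assumption).
    return torch.optim.SGD(params, lr=lr, momentum=0.9)

# Backbone choice is illustrative; the paper evaluates several ResNet/ResNeXt variants.
model = torchvision.models.resnet50(num_classes=1000)
criterion = nn.CrossEntropyLoss()

# Placeholder batches so the sketch runs; the paper uses batch size 512 at 224x224.
train_loader = [(torch.randn(8, 3, 224, 224), torch.randint(0, 1000, (8,)))]
balanced_loader = [(torch.randn(8, 3, 224, 224), torch.randint(0, 1000, (8,)))]

# Stage 1: joint representation + classifier training for 90 epochs,
# cosine learning-rate schedule decaying from 0.2 to 0.
opt = make_optimizer(model.parameters(), lr=0.2)
sched = torch.optim.lr_scheduler.CosineAnnealingLR(opt, T_max=90)
for epoch in range(90):
    for images, targets in train_loader:  # instance-balanced sampling (assumed)
        loss = criterion(model(images), targets)
        opt.zero_grad()
        loss.backward()
        opt.step()
    sched.step()

# Stage 2 (cRT): freeze the backbone, re-initialize the classifier, restart the
# learning rate, and retrain only the classifier for 10 epochs.
for p in model.parameters():
    p.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, 1000)  # new classifier, trainable by default
opt = make_optimizer(model.fc.parameters(), lr=0.2)
sched = torch.optim.lr_scheduler.CosineAnnealingLR(opt, T_max=10)
for epoch in range(10):
    for images, targets in balanced_loader:  # class-balanced sampling (assumed)
        loss = criterion(model(images), targets)
        opt.zero_grad()
        loss.backward()
        opt.step()
    sched.step()
```

The key design point reflected here is the decoupling itself: the representation is learned once in stage one, and only the final linear classifier is rebalanced in stage two, so no sampling or loss changes touch the backbone.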