Decoupling Representation and Classifier for Long-Tailed Recognition

Authors: Bingyi Kang, Saining Xie, Marcus Rohrbach, Zhicheng Yan, Albert Gordo, Jiashi Feng, Yannis Kalantidis

ICLR 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct extensive experiments and set new state-of-the-art performance on common long-tailed benchmarks like ImageNet-LT, Places-LT and iNaturalist, showing that it is possible to outperform carefully designed losses, sampling strategies, even complex modules with memory, by using a straightforward approach that decouples representation and classification.
Researcher Affiliation | Collaboration | 1Facebook AI, 2National University of Singapore; kang@u.nus.edu, {s9xie,mrf,zyan3,agordo,yannisk}@fb.com, elefjia@nus.edu.sg
Pseudocode | No | No structured pseudocode or algorithm blocks were found.
Open Source Code | Yes | Our code is available at https://github.com/facebookresearch/classifier-balancing.
Open Datasets | Yes | We perform extensive experiments on three large-scale long-tailed datasets, including Places-LT (Liu et al., 2019), ImageNet-LT (Liu et al., 2019), and iNaturalist 2018 (iNaturalist, 2018).
Dataset Splits | Yes | After training on the long-tailed datasets, we evaluate the models on the corresponding balanced test/validation datasets and report the commonly used top-1 accuracy over all classes, denoted as All. (A sketch of this metric follows the table.)
Hardware Specification | No | No specific hardware details (such as GPU/CPU models, processor types, or memory amounts) were provided.
Software Dependencies | No | We use the PyTorch (Paszke et al., 2017) framework for all experiments. No specific version number for PyTorch or other software dependencies was provided.
Experiment Setup | Yes | For all experiments, if not specified, we use SGD optimizer with momentum 0.9, batch size 512, cosine learning rate schedule (Loshchilov & Hutter, 2016) gradually decaying from 0.2 to 0, and image resolution 224×224. In the first representation learning stage, the backbone network is usually trained for 90 epochs. In the second stage, i.e., for retraining a classifier (cRT), we restart the learning rate and train it for 10 epochs while keeping the backbone network fixed.
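
The Experiment Setup row translates directly into a two-stage training recipe. Below is a minimal PyTorch sketch of that recipe under stated assumptions: the ResNeXt-50 backbone, the toy data loader, the reduced epoch counts, and the helper names (make_optimizer_and_scheduler, run_stage) are illustrative placeholders, not the authors' released implementation (see the linked repository for that).

```python
import torch
import torch.nn as nn
import torchvision.models as models

num_classes = 1000  # e.g. ImageNet-LT; Places-LT and iNaturalist 2018 differ
device = "cuda" if torch.cuda.is_available() else "cpu"

# Assumed backbone; the paper evaluates several ResNet/ResNeXt variants.
model = models.resnext50_32x4d(num_classes=num_classes).to(device)
criterion = nn.CrossEntropyLoss()

def make_optimizer_and_scheduler(params, epochs, steps_per_epoch):
    # SGD with momentum 0.9 and a cosine schedule decaying from 0.2 to 0,
    # matching the Experiment Setup row above.
    opt = torch.optim.SGD(params, lr=0.2, momentum=0.9)
    sched = torch.optim.lr_scheduler.CosineAnnealingLR(opt, T_max=epochs * steps_per_epoch)
    return opt, sched

def run_stage(epochs, trainable_params, loader):
    opt, sched = make_optimizer_and_scheduler(trainable_params, epochs, len(loader))
    model.train()
    for _ in range(epochs):
        for images, labels in loader:
            images, labels = images.to(device), labels.to(device)
            opt.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            opt.step()
            sched.step()

# Toy loader (two random 224x224 batches) so the sketch runs end to end;
# the paper uses batch size 512 on the long-tailed training sets.
loader = [(torch.randn(8, 3, 224, 224), torch.randint(0, num_classes, (8,)))
          for _ in range(2)]

# Stage 1: jointly learn representation and classifier (90 epochs in the paper).
run_stage(epochs=1, trainable_params=model.parameters(), loader=loader)

# Stage 2, cRT: freeze the backbone, re-initialize the classifier, restart the
# learning rate, and retrain only the classifier (10 epochs in the paper).
# A fuller reproduction would also keep BatchNorm running statistics fixed.
for p in model.parameters():
    p.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, num_classes).to(device)
run_stage(epochs=1, trainable_params=model.fc.parameters(), loader=loader)
```

The key design point the sketch illustrates is the decoupling: stage 2 reuses the stage-1 representation unchanged and only re-fits the linear classifier with a restarted learning rate.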
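For the evaluation protocol in the Dataset Splits row, here is a hedged sketch of the reported metric, top-1 accuracy over all classes ("All") on a balanced test/validation loader; the top1_accuracy name and loader argument are assumed for illustration, not taken from the paper.

```python
import torch

@torch.no_grad()
def top1_accuracy(model, loader, device="cpu"):
    # Top-1 accuracy over all classes on the balanced test/validation split ("All").
    model.eval()
    correct, total = 0, 0
    for images, labels in loader:
        preds = model(images.to(device)).argmax(dim=1)
        correct += (preds == labels.to(device)).sum().item()
        total += labels.size(0)
    return correct / total
```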