Long-tailed Recognition by Routing Diverse Distribution-Aware Experts
Authors: Xudong Wang, Long Lian, Zhongqi Miao, Ziwei Liu, Stella Yu
ICLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We propose a new long-tailed classifier called RoutIng Diverse Experts (RIDE). It reduces the model variance with multiple experts, reduces the model bias with a distribution-aware diversity loss, and reduces the computational cost with a dynamic expert routing module. RIDE outperforms the state-of-the-art by 5% to 7% on CIFAR100-LT, ImageNet-LT and iNaturalist 2018 benchmarks. It is also a universal framework that is applicable to various backbone networks, long-tailed algorithms, and training mechanisms for consistent performance gains. Our code is available at: https://github.com/frank-xwang/RIDE-LongTailRecognition. (A hedged multi-expert sketch is given after the table.) |
| Researcher Affiliation | Academia | Xudong Wang1, Long Lian1, Zhongqi Miao1, Ziwei Liu2, Stella X. Yu1 1UC Berkeley / ICSI, 2Nanyang Technological University {xdwang,longlian,zhongqi.miao,stellayu}@berkeley.edu ziwei.liu@ntu.edu.sg |
| Pseudocode | No | The paper describes the method using figures and text but does not include a formal pseudocode or algorithm block. |
| Open Source Code | Yes | Our code is available at: https://github.com/frank-xwang/RIDE-LongTailRecognition. |
| Open Datasets | Yes | 1. CIFAR100-LT (Cao et al., 2019): CIFAR100 is subsampled per class with an exponential decay across classes. We choose imbalance factor 100 and the ResNet-32 (He et al., 2016) backbone. 2. ImageNet-LT (Liu et al., 2019): Multiple backbone networks are experimented on ImageNet-LT... 3. iNaturalist 2018 (Van Horn et al., 2018): It is a naturally imbalanced fine-grained dataset with 8,142 categories. (See the imbalance-count sketch after the table.) |
| Dataset Splits | Yes | The original version of CIFAR-100 contains 50,000 images in the training set and 10,000 images in the validation set, with 100 categories. |
| Hardware Specification | Yes | All backbone networks are trained with a batch size of 256 on 8 RTX 2080Ti GPUs for 100 epochs. |
| Software Dependencies | No | The paper mentions optimizers (SGD) and backbone networks (ResNet-32) but does not specify software dependencies with version numbers (e.g., Python, PyTorch/TensorFlow versions). |
| Experiment Setup | Yes | CIFAR100-LT is trained for 200 epochs with standard data augmentations (He et al., 2016) and a batch size of 128 on one RTX 2080Ti GPU. The learning rate is initialized as 0.1 and decayed by 0.01 at epochs 120 and 160, respectively. (See the schedule sketch after the table.) |
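
The Research Type row summarizes RIDE as a shared backbone feeding several independent experts, trained with a distribution-aware diversity loss. The sketch below is a minimal PyTorch reconstruction of that idea, assuming simple logit averaging as the aggregation rule and a KL-based diversity term with a placeholder temperature; class names, signatures, and the temperature are illustrative assumptions, not the authors' implementation (see their repository for the real code).

```python
import torch
import torch.nn as nn

class MultiExpertHead(nn.Module):
    """Minimal RIDE-style sketch: a shared backbone feeds several
    independent classifier heads ("experts"). Illustrative only."""

    def __init__(self, backbone, feat_dim, num_classes, num_experts=3):
        super().__init__()
        self.backbone = backbone
        self.experts = nn.ModuleList(
            [nn.Linear(feat_dim, num_classes) for _ in range(num_experts)]
        )

    def forward(self, x):
        feats = self.backbone(x)                        # shared features
        logits = [expert(feats) for expert in self.experts]
        return torch.stack(logits, dim=0)               # (experts, batch, classes)

def diversity_term(expert_logits, temperature=4.0):
    """KL divergence of each expert's softened distribution from the mean
    expert distribution; a diversity loss would *maximize* this, i.e. it
    enters the total objective with a negative sign. The temperature is
    an assumed placeholder value."""
    probs = torch.softmax(expert_logits / temperature, dim=-1)
    mean = probs.mean(dim=0, keepdim=True).clamp_min(1e-8)
    kl = (probs * (probs.clamp_min(1e-8).log() - mean.log())).sum(dim=-1)
    return kl.mean()

# Usage on CIFAR-sized inputs with a toy backbone:
backbone = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 64), nn.ReLU())
model = MultiExpertHead(backbone, feat_dim=64, num_classes=100)
logits = model(torch.randn(8, 3, 32, 32))  # (3, 8, 100)
final_pred = logits.mean(dim=0)            # aggregate experts by averaging
```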
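
The Open Datasets row quotes an exponential decay with imbalance factor 100 for CIFAR100-LT. A minimal sketch of that sampling scheme, following the common construction from Cao et al. (2019); the function and argument names are hypothetical:

```python
def long_tailed_counts(num_classes=100, max_samples=500, imbalance_factor=100):
    """Per-class training-set sizes decaying exponentially from head to tail.

    With imbalance_factor = 100, the head class keeps all 500 CIFAR-100
    training images and the tail class keeps 500 / 100 = 5.
    """
    return [
        int(max_samples * (1.0 / imbalance_factor) ** (c / (num_classes - 1)))
        for c in range(num_classes)
    ]

counts = long_tailed_counts()
print(counts[0], counts[-1])  # 500 5
```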
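
The Experiment Setup row fully specifies the CIFAR100-LT learning-rate schedule, which maps directly onto a step scheduler. A sketch assuming PyTorch's `MultiStepLR`; the momentum and weight-decay values are common CIFAR defaults, not values quoted in this report:

```python
import torch
import torch.nn as nn

model = nn.Linear(64, 100)  # stand-in for the ResNet-32 backbone
optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                            momentum=0.9, weight_decay=5e-4)  # assumed defaults
# LR 0.1, multiplied by 0.01 at epochs 120 and 160, per the quoted setup
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[120, 160], gamma=0.01)

for epoch in range(200):
    # ... one training epoch with batch size 128 ...
    scheduler.step()
```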