Optimal Transport for Long-Tailed Recognition with Learnable Cost Matrix

Authors: Hanyu Peng, Mingming Sun, Ping Li

ICLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we first conduct experiments comparing our approach versus extant post-hoc correction methods on three data sets, including CIFAR-100-LT (Cao et al., 2019), ImageNet-LT (Liu et al., 2019), and iNaturalist (Horn et al., 2018) with varying backbones. Finally, we empirically make a comparison of our algorithm with alternative cutting-edge long-tailed recognition methods.
Researcher Affiliation | Industry | Hanyu Peng, Mingming Sun, Ping Li; Cognitive Computing Lab, Baidu Research; No.10 Xibeiwang East Road, Beijing 100193, China; 10900 NE 8th St., Bellevue, Washington 98004, USA; {penghanyu,sunmingming01,liping11}@baidu.com
Pseudocode | Yes | Algorithm 1: Solve OT-related algorithm efficiently in the post-hoc correction via Sinkhorn Algorithm. (A Sinkhorn sketch is given after the table.)
Open Source Code | No | The paper mentions implementing experiments in PaddlePaddle but does not provide a statement about releasing its own source code or a link to it.
Open Datasets | Yes | We take experiments on three data sets including CIFAR-100-LT, ImageNet-LT, and iNaturalist. We build the imbalanced version of CIFAR-100 by downsampling samples per class following the profile in Liu et al. (2019); Kang et al. (2020) with imbalanced ratios 10, 50, and 100. (A per-class count sketch is given after the table.)
Dataset Splits | Yes | Having a collection of training samples $\{(x^s_n, y^s_n)\}_{n=1}^{N_s}$, validation samples $\{(x^v_n, y^v_n)\}_{n=1}^{N_v}$ and test samples $\{(x^t_n, y^t_n)\}_{n=1}^{N_t}$ for classification with $K$ labels and input $x \in \mathbb{R}^d$
Hardware Specification | Yes | Except for OTLM, which was run on an NVIDIA card (V100), the results come from a 28-core machine (2.20 GHz Xeon).
Software Dependencies | No | The paper states, 'All our experiments are implemented in the PaddlePaddle deep learning platform,' but it does not specify version numbers for PaddlePaddle or any other software dependencies.
Experiment Setup | Yes | The specific implementation details for each data set under the different methods are described below. [...] We apply SGD with batch size 256 and weight decay 0.0005 to train a ResNet-32 (He et al., 2016) model for 200 epochs; we employ the linear warm-up learning rate schedule for the first five epochs. We also set the base learning rate to 0.2 and reduce it at epochs 120 and 160 by a factor of 100. (A learning-rate schedule sketch is given after the table.)
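
The Pseudocode row references the paper's Algorithm 1, which solves the optimal-transport problem in the post-hoc correction with Sinkhorn iterations. For reference, below is a minimal sketch of the standard entropic-regularized Sinkhorn loop; the function name, the marginals a and b, and the parameters reg and n_iters are illustrative and not the paper's notation.

import numpy as np

def sinkhorn(cost, a, b, reg=0.1, n_iters=100):
    # Gibbs kernel derived from the cost matrix and the regularization strength
    K = np.exp(-cost / reg)
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iters):
        u = a / (K @ v)        # scale rows to match the source marginal a
        v = b / (K.T @ u)      # scale columns to match the target marginal b
    return u[:, None] * K * v[None, :]   # transport plan

# toy usage: move a skewed class distribution toward a balanced one
num_classes = 5
cost = 1.0 - np.eye(num_classes)                 # cheap to stay on the same class
a = np.array([0.5, 0.2, 0.15, 0.1, 0.05])        # long-tailed source marginal
b = np.full(num_classes, 1.0 / num_classes)      # balanced target marginal
plan = sinkhorn(cost, a, b)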
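The CIFAR-100-LT construction quoted in the Open Datasets row downsamples the samples of each class. The sketch below assumes the exponentially decaying profile commonly used for CIFAR-LT benchmarks (the head class keeps all 500 training images and the rarest class keeps 500 / imbalance_ratio); the function name and arguments are illustrative, not taken from the paper.

def long_tailed_counts(n_max=500, num_classes=100, imbalance_ratio=100):
    # Exponentially decaying per-class counts: class 0 keeps n_max samples,
    # the last class keeps n_max / imbalance_ratio samples.
    counts = []
    for k in range(num_classes):
        frac = (1.0 / imbalance_ratio) ** (k / (num_classes - 1))
        counts.append(int(n_max * frac))
    return counts

# imbalance ratio 100: 500 samples for the most frequent class, 5 for the rarest
counts = long_tailed_counts(imbalance_ratio=100)
print(counts[0], counts[-1])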
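The schedule quoted in the Experiment Setup row (linear warm-up for five epochs, base rate 0.2, reductions at epochs 120 and 160) can be read as the per-epoch rule sketched below. Interpreting the "factor of 100" as a per-milestone multiplier of 0.01 is an assumption, and the function name and arguments are illustrative.

def learning_rate(epoch, base_lr=0.2, warmup_epochs=5,
                  milestones=(120, 160), decay=0.01):
    # Linear warm-up over the first warmup_epochs epochs
    if epoch < warmup_epochs:
        return base_lr * (epoch + 1) / warmup_epochs
    # Step decay: multiply by `decay` at every milestone already passed
    lr = base_lr
    for milestone in milestones:
        if epoch >= milestone:
            lr *= decay
    return lr

# rates during warm-up, at the base rate, and after each milestone
print([learning_rate(e) for e in (0, 10, 130, 170)])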