Partial and Asymmetric Contrastive Learning for Out-of-Distribution Detection in Long-Tailed Recognition
Authors: Haotao Wang, Aston Zhang, Yi Zhu, Shuai Zheng, Mu Li, Alex J. Smola, Zhangyang Wang
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our method outperforms previous state-of-the-art method by 1.29%, 1.45%, 0.69% anomaly detection false positive rate (FPR) and 3.24%, 4.06%, 7.89% in-distribution classification accuracy on CIFAR10-LT, CIFAR100-LT, and ImageNet-LT, respectively. |
| Researcher Affiliation | Collaboration | University of Texas at Austin, Austin, USA; Amazon Web Services, Santa Clara, USA. |
| Pseudocode | Yes | Algorithm 1 Partial and Asymmetric Supervised Contrastive Learning (PASCL) |
| Open Source Code | Yes | Code and pre-trained models are available at https://github.com/amazon-research/long-tailed-ood-detection. |
| Open Datasets | Yes | We use three popular long-tailed image classification datasets, CIFAR10-LT, CIFAR100-LT (Cao et al., 2019), and ImageNet-LT (Liu et al., 2019), as the in-distribution training data (i.e., Din). |
| Dataset Splits | Yes | We use the original CIFAR10 and CIFAR100 test sets and the ImageNet validation set as the in-distribution test sets (i.e., Dtest in ). |
| Hardware Specification | No | The paper does not explicitly mention the specific hardware used for running the experiments (e.g., GPU/CPU models, memory specifications). |
| Software Dependencies | No | The paper mentions "Adam (Kingma & Ba, 2014) optimizer" and "cosine annealing learning rate scheduler (Loshchilov & Hutter, 2016)", but it does not specify version numbers for broader software components or libraries (e.g., Python, PyTorch, CUDA versions). |
| Experiment Setup | Yes | For experiments on CIFAR10-LT and CIFAR100-LT, we train the main branch (i.e., stage 1 in Algorithm 1) for 200 epochs using Adam (Kingma & Ba, 2014) optimizer with initial learning rate 1×10⁻³ and batch size 256. We decay the learning rate to 0 using a cosine annealing learning rate scheduler (Loshchilov & Hutter, 2016). For auxiliary branch finetuning (i.e., stage 2 in Algorithm 1), we finetune the auxiliary branch for 3 epochs using Adam optimizer with initial learning rate 5×10⁻⁴. Other hyper-parameters are the same as in main branch training. For experiments on ImageNet-LT, we follow the settings in (Wang et al., 2021b). Specifically, we train the main branch for 100 epochs using SGD optimizer with initial learning rate 0.1 and batch size 256. We decay the learning rate by a factor of 10 at epochs 60 and 80. For auxiliary branch finetuning, we finetune the auxiliary branch for 3 epochs using SGD optimizer with initial learning rate 0.01, which is decayed by a factor of 10 after each finetune epoch. On all datasets, we set τ = 0.1 following (Khosla et al., 2020), λ1 = 0.5 following (Hendrycks et al., 2019), and empirically set λ2 = 0.1 for PASCL. (Hedged code sketches of the contrastive loss and this optimization setup follow the table.) |
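The Pseudocode row names Algorithm 1 (PASCL), which builds on supervised contrastive learning with temperature τ = 0.1 (Khosla et al., 2020), as stated in the Experiment Setup row. The snippet below is a minimal PyTorch sketch of that standard supervised contrastive loss only; PASCL's partial anchor selection and its asymmetric treatment of the auxiliary OOD samples are not reproduced here, and the function name `supcon_loss` is our own placeholder, not an identifier from the authors' code.

```python
import torch
import torch.nn.functional as F


def supcon_loss(features: torch.Tensor, labels: torch.Tensor, tau: float = 0.1) -> torch.Tensor:
    """Plain supervised contrastive loss (Khosla et al., 2020) with temperature tau.

    features: (N, d) embeddings (normalized inside); labels: (N,) integer class ids.
    """
    z = F.normalize(features, dim=1)
    n = z.size(0)
    sim = (z @ z.t()) / tau                                    # pairwise cosine similarities / tau
    self_mask = torch.eye(n, dtype=torch.bool, device=z.device)
    pos_mask = labels.unsqueeze(0).eq(labels.unsqueeze(1)) & ~self_mask

    # Softmax denominator runs over all samples except the anchor itself.
    logits = sim.masked_fill(self_mask, float("-inf"))
    log_prob = logits - torch.logsumexp(logits, dim=1, keepdim=True)

    # Average the log-probability over each anchor's positives; skip anchors
    # with no positive in the batch (singleton classes).
    pos_counts = pos_mask.sum(dim=1)
    has_pos = pos_counts > 0
    if not has_pos.any():
        return features.new_zeros(())
    sum_log_prob_pos = log_prob.masked_fill(~pos_mask, 0.0).sum(dim=1)
    return -(sum_log_prob_pos[has_pos] / pos_counts[has_pos]).mean()
```

A typical call would pass projected embeddings and labels for a batch, e.g. `supcon_loss(proj_head(backbone(x)), y)`, where `proj_head` and `backbone` are placeholders for whatever encoder and projection head are in use.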
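The Experiment Setup row reports Adam with initial learning rate 1×10⁻³, batch size 256, 200 epochs, and cosine annealing of the learning rate to 0 for stage-1 training on CIFAR10-LT/CIFAR100-LT. The sketch below wires those numbers into a generic PyTorch training loop; the toy model, random tensors, and cross-entropy criterion are placeholders, not the paper's actual backbone, data pipeline, or objective.

```python
import torch
import torch.nn as nn
from torch.optim import Adam
from torch.optim.lr_scheduler import CosineAnnealingLR
from torch.utils.data import DataLoader, TensorDataset

EPOCHS, BATCH_SIZE = 200, 256  # values reported for CIFAR10-LT / CIFAR100-LT stage 1

# Placeholder CIFAR-shaped data and a toy classifier standing in for the real backbone.
data = TensorDataset(torch.randn(1024, 3, 32, 32), torch.randint(0, 10, (1024,)))
loader = DataLoader(data, batch_size=BATCH_SIZE, shuffle=True)
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
criterion = nn.CrossEntropyLoss()  # stands in for the paper's full training objective

optimizer = Adam(model.parameters(), lr=1e-3)
scheduler = CosineAnnealingLR(optimizer, T_max=EPOCHS, eta_min=0.0)  # decay lr to 0

for epoch in range(EPOCHS):
    for x, y in loader:
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()
    scheduler.step()  # cosine-anneal the learning rate once per epoch
```

The ImageNet-LT settings in the same row (SGD, initial learning rate 0.1, step decay by 10× at epochs 60 and 80) would swap `Adam`/`CosineAnnealingLR` for `torch.optim.SGD` and `torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[60, 80], gamma=0.1)`.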