P²OT: Progressive Partial Optimal Transport for Deep Imbalanced Clustering

Authors: Chuyu Zhang, Hui Ren, Xuming He

ICLR 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on various datasets, including a human-curated long-tailed CIFAR100, challenging ImageNet-R, and large-scale subsets of fine-grained iNaturalist2018 datasets, demonstrate the superiority of our method.
Researcher Affiliation | Academia | Chuyu Zhang1,2, Hui Ren1, Xuming He1,3; 1ShanghaiTech University, Shanghai, China; 2Lingang Laboratory, Shanghai, China; 3Shanghai Engineering Research Center of Intelligent Vision and Imaging, Shanghai, China; {zhangchy2,renhui,hexm}@shanghaitech.edu.cn
Pseudocode | Yes | Algorithm 1: Scaling Algorithm for P²OT. Input: cost matrix −log P, ϵ, λ, ρ, N, K, a large value ι. Initialize C ← [−log P, 0_N], λ ← [λ, ..., λ, ι] ∈ R^(K+1), β ← [ρ/K · 1_K, 1 − ρ], α ← (1/N) 1_N, b ← 1_(K+1), M ← exp(−C/ϵ), f ← λ/(λ + ϵ). While b has not converged: a ← α/(Mb); b ← (β/(M⊤a))^f. Finally Q ← diag(a) M diag(b); return Q[:, :K]. A hedged NumPy sketch of this scaling loop is given after the table.
Open Source Code | Yes | Code is available at https://github.com/rhfeiyang/PPOT.
Open Datasets | Yes | To evaluate our method, we have established a realistic and challenging benchmark, including CIFAR100 (Krizhevsky et al., 2009), ImageNet-R (abbreviated as ImgNet-R) (Hendrycks et al., 2021) and iNaturalist2018 (Van Horn et al., 2018) datasets.
Dataset Splits | No | For ImgNet-R... we split 20 images per class as the test set, leaving the remaining data as the training set (R = 13). This specifies only a train/test split, not a validation split. The phrase 'We utilize the loss on training sets for clustering head and model selection' does not describe a validation split. A hedged sketch of such a per-class split is given after the table.
Hardware Specification | Yes | This comparison is conducted on iNature1000 using identical conditions (NVIDIA TITAN RTX, ϵ = 0.1, λ = 1), without employing any acceleration strategies for both.
Software Dependencies | No | The paper mentions using 'ViT-B16' and the 'Adam optimizer' but does not specify version numbers for Python, PyTorch, or other relevant software libraries.
Experiment Setup | Yes | Specifically, we train 50 epochs and adopt the Adam optimizer with the learning rate decaying from 5e-4 to 5e-6 for all datasets. The batch size is 512. Further details can be found in Appendix F. For hyperparameters, we set λ as 1, ϵ as 0.1, and initial ρ as 0.1. The stopping criterion of Alg. 1 is when the change of b is less than 1e-6, or the iteration count reaches 1000. A hedged sketch of this training configuration is given after the table.
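
Below is a minimal NumPy sketch of the scaling loop quoted in the Pseudocode row, assuming log_P is an N×K matrix of log prediction probabilities. The function name, default arguments, and convergence test are illustrative choices, not the authors' released implementation.

import numpy as np

def p2ot_scaling(log_P, eps=0.1, lam=1.0, rho=0.1, iota=1e3,
                 tol=1e-6, max_iter=1000):
    """Sketch of Algorithm 1: scaling algorithm for P²OT.

    log_P : (N, K) array of log prediction probabilities.
    Returns a soft pseudo-label matrix Q of shape (N, K).
    """
    N, K = log_P.shape
    # Augment the cost with a zero-cost virtual cluster that absorbs the 1 - rho mass.
    C = np.hstack([-log_P, np.zeros((N, 1))])                 # C <- [-log P, 0_N]
    lam_vec = np.concatenate([np.full(K, lam), [iota]])       # lam <- [lam, ..., lam, iota]
    beta = np.concatenate([np.full(K, rho / K), [1 - rho]])   # column marginal
    alpha = np.full(N, 1.0 / N)                               # row marginal
    b = np.ones(K + 1)
    M = np.exp(-C / eps)
    f = lam_vec / (lam_vec + eps)                             # KL-relaxation exponent

    for _ in range(max_iter):
        a = alpha / (M @ b)
        b_new = (beta / (M.T @ a)) ** f
        if np.abs(b_new - b).max() < tol:                     # stop when b changes by < tol
            b = b_new
            break
        b = b_new

    Q = a[:, None] * M * b[None, :]                           # diag(a) M diag(b)
    return Q[:, :K]                                           # drop the virtual cluster

Calling p2ot_scaling on the log of softmax outputs would return soft pseudo-labels whose total mass is roughly ρ, with the remaining 1 − ρ absorbed by the zero-cost virtual cluster.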
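
The Dataset Splits row quotes a per-class hold-out of 20 images for ImgNet-R. The sketch below illustrates such a split; the paper excerpt does not state whether the held-out images are chosen randomly, so the shuffling and seed here are assumptions.

import numpy as np

def split_per_class(labels, n_test_per_class=20, seed=0):
    """Hold out n_test_per_class images per class as the test set; the rest train.

    labels : 1-D integer array of class ids, one per image.
    Returns (train_idx, test_idx) index arrays.
    """
    rng = np.random.default_rng(seed)
    train_idx, test_idx = [], []
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        rng.shuffle(idx)                          # random choice is an assumption
        test_idx.append(idx[:n_test_per_class])
        train_idx.append(idx[n_test_per_class:])
    return np.concatenate(train_idx), np.concatenate(test_idx)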
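
The Experiment Setup row quotes the optimizer, learning-rate range, epoch count, and batch size. The PyTorch sketch below wires those quoted values together; the schedule family (cosine annealing), the stand-in model, and the loop skeleton are assumptions, since only the endpoints 5e-4 and 5e-6 are reported here.

import torch

# Stand-in module so the sketch runs; the paper uses a ViT-B16 backbone with a
# clustering head, which is not reproduced here.
model = torch.nn.Linear(768, 100)

optimizer = torch.optim.Adam(model.parameters(), lr=5e-4)
# Decay the learning rate from 5e-4 to 5e-6 over 50 epochs; cosine annealing is
# an assumed schedule family, only the endpoints are quoted in the table.
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=50, eta_min=5e-6)

for epoch in range(50):
    # for batch in train_loader:  # batch size 512; data loader and loss not shown
    #     loss = ...
    #     loss.backward(); optimizer.step(); optimizer.zero_grad()
    scheduler.step()  # one scheduler step per epoch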