P$^2$OT: Progressive Partial Optimal Transport for Deep Imbalanced Clustering
Authors: Chuyu Zhang, Hui Ren, Xuming He
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on various datasets, including a human-curated long-tailed CIFAR100, challenging ImageNet-R, and large-scale subsets of fine-grained iNaturalist2018 datasets, demonstrate the superiority of our method. |
| Researcher Affiliation | Academia | Chuyu Zhang^{1,2}, Hui Ren^{1}, Xuming He^{1,3}; ^{1}ShanghaiTech University, Shanghai, China; ^{2}Lingang Laboratory, Shanghai, China; ^{3}Shanghai Engineering Research Center of Intelligent Vision and Imaging, Shanghai, China; {zhangchy2,renhui,hexm}@shanghaitech.edu.cn |
| Pseudocode | Yes | Algorithm 1: Scaling Algorithm for P2OT. Input: cost matrix −log P, ϵ, λ, ρ, N, K, a large value ι. Initialize C ← [−log P, 0_N], λ ← [λ, ..., λ, ι] ∈ R^{K+1}, β ← [ρ/K · 1_K; 1 − ρ], α ← (1/N) 1_N, b ← 1_{K+1}, M ← exp(−C/ϵ), f ← λ/(λ + ϵ). While b has not converged: a ← α / (Mb); b ← (β / (Mᵀa))^f. Then Q ← diag(a) M diag(b); return Q[:, :K]. (A runnable sketch is given after the table.) |
| Open Source Code | Yes | Code is available at https://github.com/rhfeiyang/PPOT. |
| Open Datasets | Yes | To evaluate our method, we have established a realistic and challenging benchmark, including CIFAR100 (Krizhevsky et al., 2009), ImageNet-R (abbreviated as ImgNet-R) (Hendrycks et al., 2021) and iNaturalist2018 (Van Horn et al., 2018) datasets. |
| Dataset Splits | No | For ImgNet-R... we split 20 images per class as the test set, leaving the remaining data as the training set (R = 13). This only specifies train/test, not validation. The phrase 'We utilize the loss on training sets for clustering head and model selection' does not describe a validation split. |
| Hardware Specification | Yes | This comparison is conducted on iNature1000 using identical conditions (NVIDIA TITAN RTX, ϵ = 0.1, λ = 1), without employing any acceleration strategies for both. |
| Software Dependencies | No | The paper mentions using 'ViT-B16' and the 'Adam optimizer' but does not specify version numbers for Python, PyTorch, or other relevant software libraries. |
| Experiment Setup | Yes | Specifically, we train 50 epochs and adopt the Adam optimizer with the learning rate decay from 5e-4 to 5e-6 for all datasets. The batch size is 512. Further details can be found in Appendix F. For hyperparameters, we set λ as 1, ϵ as 0.1, and initial ρ as 0.1. The stop criterion of Alg. 1 is when the change of b is less than 1e-6, or the iteration reaches 1000. (A hedged training-setup sketch follows the table.) |
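
The Pseudocode row above can be read as the following minimal NumPy sketch of the scaling iterations. It follows our reconstruction of the extracted Algorithm 1 (hard row marginal α = (1/N)·1_N, soft column marginal β = [ρ/K·1_K; 1−ρ] with penalty weights [λ, ..., λ, ι]) and the stop criterion quoted in the Experiment Setup row; the function and variable names (`p2ot_scaling`, `lam_vec`, `beta`) are ours, not the authors' — their reference implementation is in the linked GitHub repository.

```python
import numpy as np

def p2ot_scaling(log_P, eps=0.1, lam=1.0, rho=0.1, large=1e8,
                 tol=1e-6, max_iter=1000):
    """Sketch of the P^2OT scaling iterations (Algorithm 1).

    log_P: (N, K) array of log predictions; the transport cost is -log_P.
    Returns a soft pseudo-label matrix Q of shape (N, K).
    """
    N, K = log_P.shape
    # Append a virtual zero-cost column that absorbs the (1 - rho) unselected mass.
    C = np.concatenate([-log_P, np.zeros((N, 1))], axis=1)
    lam_vec = np.concatenate([np.full(K, lam), [large]])   # [lam, ..., lam, iota]
    beta = np.concatenate([np.full(K, rho / K), [1.0 - rho]])
    alpha = np.full(N, 1.0 / N)

    b = np.ones(K + 1)
    M = np.exp(-C / eps)
    f = lam_vec / (lam_vec + eps)   # element-wise exponent for the column update

    for _ in range(max_iter):
        a = alpha / (M @ b)                  # hard row-marginal update
        b_new = (beta / (M.T @ a)) ** f      # softened column-marginal update
        if np.max(np.abs(b_new - b)) < tol:  # stop when the change of b is small
            b = b_new
            break
        b = b_new

    Q = a[:, None] * M * b[None, :]          # diag(a) @ M @ diag(b)
    return Q[:, :K]                          # drop the virtual column
```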
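
The Experiment Setup row likewise translates into a short training skeleton. The excerpt does not state the decay schedule type or the backbone wiring, so the cosine annealing from 5e-4 to 5e-6, the placeholder linear head (standing in for the ViT-B16 features), and the loop-body comments below are assumptions, not the authors' code.

```python
import torch
from torch import nn

# Placeholder clustering head; the paper trains on top of a ViT-B16 backbone
# that is not reproduced here (768-dim features -> 100 clusters, CIFAR100 setting).
model = nn.Linear(768, 100)

optimizer = torch.optim.Adam(model.parameters(), lr=5e-4)
# Schedule type is unspecified in the excerpt; cosine annealing down to 5e-6
# over the 50 training epochs is one plausible reading of "decay from 5e-4 to 5e-6".
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=50, eta_min=5e-6)

for epoch in range(50):
    # ... iterate over the training set with batch size 512, compute pseudo-labels
    # with the P^2OT scaling algorithm above, minimize the loss against them,
    # and call optimizer.step() per batch ...
    scheduler.step()
```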