SoLar: Sinkhorn Label Refinery for Imbalanced Partial-Label Learning

Authors: Haobo Wang, Mingxuan Xia, Yixuan Li, Yuren Mao, Lei Feng, Gang Chen, Junbo Zhao

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Through extensive experiments, SoLar exhibits substantially superior results on standardized benchmarks compared to the previous state-of-the-art PLL methods."
Researcher Affiliation | Academia | 1 Key Lab of Intelligent Computing based Big Data of Zhejiang Province, Zhejiang University; 2 School of Software Technology, Zhejiang University; 3 Department of Computer Sciences, University of Wisconsin-Madison; 4 College of Computer Science, Chongqing University; 5 Center for Advanced Intelligence Project, RIKEN
Pseudocode | Yes | "Algorithm 1: Pseudo-code of SoLar."
Open Source Code | Yes | "Code and data are available at: https://github.com/hbzju/SoLar."
Open Datasets | Yes | "First, we evaluate SoLar on two long-tailed datasets CIFAR10-LT and CIFAR100-LT introduced in [20, 21]. ... we conduct experiments on the large-scale SUN397 dataset [27]."
Dataset Splits | Yes | "The training images are randomly removed class-wise to follow a pre-defined imbalance ratio γ = n1/nL, where nj is the image number of the j-th class. ... For the SUN397 dataset, we hold out 50 samples per class for testing." (The split rule is sketched in code after this table.)
Hardware Specification | Yes | "We train all models on 8 NVIDIA A100 GPUs."
Software Dependencies | No | The paper mentions ResNet, an SGD optimizer, consistency regularization [19], and Mixup [25], but does not give version numbers for any software libraries or dependencies (e.g., a specific PyTorch release).
Experiment Setup | Yes | "The model is trained for 1000 epochs using a standard SGD optimizer with a momentum of 0.9. The initial learning rate is set as 0.01, and decays by the cosine learning rate schedule. The batch size is 256. ... For our Sinkhorn-Knopp algorithm, we fix the smoothing regularization parameter as λ = 3 and the length of the queue for acceleration as 64 times batch size. The moving-average parameter µ for class prior estimation is set as 0.1/0.05 in the first stage and fixed as 0.01 later. For class-wise reliable sample selection, we linearly ramp up ρ from 0.2 to 0.5/0.6 in the first 50 epochs and fix the high-confidence selection threshold τ as 0.99." (See the sketches after this table.)
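
The imbalance rule γ = n1/nL quoted under Dataset Splits is, in the long-tailed CIFAR benchmarks the paper cites [20, 21], typically realized as an exponential decay of per-class sample counts. Below is a minimal sketch of that computation; the helper name `long_tailed_counts` is hypothetical, and the paper's own subsampling script may differ in details such as rounding.

```python
def long_tailed_counts(n_max: int, num_classes: int, gamma: float) -> list[int]:
    """Per-class sample counts decaying exponentially from n_max to n_max / gamma."""
    # Class j keeps n_max * gamma^(-j / (num_classes - 1)) images, so the ratio
    # between the largest and smallest class, n1 / nL, equals gamma.
    return [round(n_max * gamma ** (-j / (num_classes - 1)))
            for j in range(num_classes)]

# Example: CIFAR10-LT with 5,000 images in the head class and gamma = 100
print(long_tailed_counts(5000, 10, 100))  # [5000, 2997, 1797, ..., 50]
```

The same rule covers CIFAR100-LT with num_classes = 100.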
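
The optimizer and schedule in the Experiment Setup row map directly onto standard PyTorch calls. The following is a sketch under the assumption that the cosine schedule is stepped once per epoch over the full 1000-epoch budget; the backbone depth and the `train_one_epoch` helper are illustrative, and the released code should be treated as authoritative.

```python
import torch
from torchvision.models import resnet18

model = resnet18(num_classes=10)  # ResNet backbone; the exact depth is an assumption
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
# Cosine decay of the 0.01 initial learning rate across all 1000 epochs
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=1000)

for epoch in range(1000):
    train_one_epoch(model, optimizer)  # batch size 256; hypothetical helper
    scheduler.step()
```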
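
For the Sinkhorn-Knopp step, the quoted hyperparameters fit the standard entropy-regularized formulation: alternately rescale a matrix of predictions so that its class marginals match the estimated class prior while each sample keeps equal total mass. The sketch below shows only this core iteration; `sinkhorn_refine` is a hypothetical name, and the queue-based acceleration (64 times batch size) and candidate-label masking from the paper's Algorithm 1 are omitted.

```python
import torch

def sinkhorn_refine(probs: torch.Tensor, class_prior: torch.Tensor,
                    lam: float = 3.0, n_iters: int = 50) -> torch.Tensor:
    """Refine (n, K) prediction probabilities into pseudo-labels whose class
    marginals match `class_prior`, a (K,) distribution summing to 1."""
    Q = probs ** lam      # smoothing regularization; the paper fixes lam = 3
    Q = Q / Q.sum()       # view Q as a joint distribution over (sample, class)
    n = Q.shape[0]
    for _ in range(n_iters):
        # Column step: pull class marginals toward the estimated class prior
        Q = Q * (class_prior / (Q.sum(dim=0) + 1e-8)).unsqueeze(0)
        # Row step: give every sample equal total mass 1 / n
        Q = Q * ((1.0 / n) / (Q.sum(dim=1) + 1e-8)).unsqueeze(1)
    return Q * n          # rows now sum to ~1: soft refined pseudo-labels
```

In the paper's pipeline, the refined assignments then feed class-wise reliable-sample selection with the ramped ratio ρ and the fixed high-confidence threshold τ = 0.99 quoted above.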