Sinkhorn Label Allocation: Semi-Supervised Classification via Annealed Self-Training

Authors: Kai Sheng Tai, Peter D. Bailis, Gregory Valiant

ICML 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate the effectiveness of our algorithm on the CIFAR-10, CIFAR-100, and SVHN datasets in comparison with FixMatch, a state-of-the-art self-training algorithm. Our main baseline for comparison is the FixMatch algorithm (Sohn et al., 2020) since it is a state-of-the-art method for semi-supervised image classification. For each configuration, we report the mean and standard deviation of the error rate across 5 independent trials. (The trial aggregation is sketched after the table.)
Researcher Affiliation | Academia | Stanford University, Stanford, CA, USA. Correspondence to: Kai Sheng Tai <kst@cs.stanford.edu>.
Pseudocode | Yes | Algorithm 1 Sinkhorn Label Allocation (SLA) and Algorithm 2 Self-training with Sinkhorn Label Allocation and consistency regularization. (A simplified Sinkhorn scaling step is sketched after the table.)
Open Source Code | Yes | Our code is available at https://github.com/stanford-futuredata/sinkhorn-label-allocation.
Open Datasets | Yes | We used the CIFAR-10, CIFAR-100 (Krizhevsky, 2009), and SVHN (Netzer et al., 2011) image classification datasets with their standard train/test splits.
Dataset Splits | No | The paper mentions 'standard train/test splits' for CIFAR-10, CIFAR-100, and SVHN and details how labeled and unlabeled examples are sampled from the training split; however, it does not explicitly describe a separate validation split. (A split-sampling sketch follows the table.)
Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types, or memory amounts) used for running its experiments.
Software Dependencies | No | The paper describes various methods and architectures (e.g., 'RandAugment', the 'WideResNet-28-2' architecture), but it does not specify version numbers for any software dependencies such as programming languages, libraries, or frameworks (e.g., 'Python 3.x', 'PyTorch 1.x'). (An illustrative augmentation setup appears after the table.)
Experiment Setup | Yes | We optimized our classifiers using the stochastic Nesterov accelerated gradient method with a momentum parameter of 0.9 and a cosine learning rate schedule given by 0.03 cos(7πt/16T), where t is the current iteration and T = 2^20 is the total number of iterations. We used a labeled batch size of 64, an unlabeled batch size of 448, weight decay of 5 × 10^-4 on all parameters except biases and batch normalization weights, and unlabeled loss weight λ = 1. For hyperparameters specific to SLA, we used a Sinkhorn regularization parameter of γ = 100 and a tolerance parameter ε_t = 0.01 ‖c_t‖_1 for the Sinkhorn iteration, where c_t is the target column sum at iteration t. (The optimizer and learning rate schedule are sketched below.)
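
To make the reporting protocol in the Research Type row concrete, here is a minimal sketch of aggregating the error rate over 5 independent trials into a mean and standard deviation. The error-rate values are placeholders, not results from the paper, and the paper does not state which standard-deviation convention it uses.

```python
import numpy as np

# Hypothetical error rates (%) from 5 independent trials of one configuration;
# these numbers are placeholders, not results reported in the paper.
error_rates = np.array([5.2, 5.0, 5.4, 5.1, 5.3])

mean = error_rates.mean()
std = error_rates.std(ddof=1)  # sample standard deviation across trials (convention assumed)

print(f"error rate: {mean:.2f} ± {std:.2f}")
```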
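
The Pseudocode row refers to Algorithm 1 (Sinkhorn Label Allocation). As an illustration only, the sketch below shows the balanced entropy-regularized allocation step via standard Sinkhorn scaling: given model log-probabilities for a batch of unlabeled examples, it alternately rescales rows and columns of a kernel matrix toward target marginals. The paper's Algorithm 1 additionally handles upper-bound (inequality) constraints and an annealed allocation budget, which this sketch omits; the function and variable names are my own, not the authors'.

```python
import numpy as np

def sinkhorn_allocate(log_probs, gamma=10.0, row_marginal=None, col_marginal=None,
                      tol=1e-2, max_iters=1000):
    """Balanced Sinkhorn scaling toward fixed row/column marginals.

    log_probs: (n, k) array of per-example class log-probabilities.
    gamma: entropy-regularization strength (larger -> closer to hard assignments).
    Returns an (n, k) allocation matrix Q with the requested marginals.
    Note: the paper uses gamma = 100, which in practice calls for log-domain
    updates for numerical stability; this plain-domain sketch defaults to a
    smaller value.
    """
    n, k = log_probs.shape
    if row_marginal is None:
        row_marginal = np.full(n, 1.0 / n)   # each example receives equal mass
    if col_marginal is None:
        col_marginal = np.full(k, 1.0 / k)   # assume class-balanced targets
    # Entropy-regularized kernel; subtracting the row-wise max only rescales
    # rows, which the scaling vector u absorbs, so the solution is unchanged.
    K = np.exp(gamma * (log_probs - log_probs.max(axis=1, keepdims=True)))
    u, v = np.ones(n), np.ones(k)
    Q = K
    for _ in range(max_iters):
        u = row_marginal / (K @ v)           # enforce row sums
        v = col_marginal / (K.T @ u)         # enforce column sums
        Q = u[:, None] * K * v[None, :]
        # After the column update the column sums match exactly; stop once the
        # row sums are also within tolerance.
        if np.abs(Q.sum(axis=1) - row_marginal).sum() < tol:
            break
    return Q
```

Scaling the returned Q by n recovers per-example soft label distributions whose aggregate class frequencies approximately match the balanced column targets.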
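
The Open Datasets and Dataset Splits rows note that labeled and unlabeled examples are drawn from the standard training split. Below is a minimal sketch of that sampling for CIFAR-10 with torchvision; the per-class label count and seed are illustrative, and the authors' own splitting code lives in the linked repository.

```python
import numpy as np
from torchvision.datasets import CIFAR10

def sample_labeled_indices(targets, labels_per_class, num_classes=10, seed=0):
    """Pick `labels_per_class` indices per class from the training split;
    the remaining training examples serve as the unlabeled set."""
    rng = np.random.default_rng(seed)
    targets = np.asarray(targets)
    labeled = []
    for c in range(num_classes):
        class_idx = np.flatnonzero(targets == c)
        labeled.extend(rng.choice(class_idx, size=labels_per_class, replace=False))
    labeled = np.array(sorted(labeled))
    unlabeled = np.setdiff1d(np.arange(len(targets)), labeled)
    return labeled, unlabeled

# Standard train/test splits as shipped by torchvision.
train_set = CIFAR10(root="./data", train=True, download=True)
test_set = CIFAR10(root="./data", train=False, download=True)

# Illustrative low-label setting: 4 labels per class (40 labels total) on CIFAR-10.
labeled_idx, unlabeled_idx = sample_labeled_indices(train_set.targets, labels_per_class=4)
```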
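
The Software Dependencies row points out that no library versions are pinned for components such as RandAugment and WideResNet-28-2. Purely as an illustration of one possible implementation, and not necessarily the one the authors used, torchvision ships a RandAugment transform; WideResNet-28-2 is typically a custom module rather than a torchvision model, and the augmentation hyperparameters below are illustrative defaults rather than the paper's settings.

```python
import torch
import torchvision
from torchvision import transforms

# Record the library versions actually used, since the paper does not pin them.
print("torch", torch.__version__, "torchvision", torchvision.__version__)

# One available RandAugment implementation (requires torchvision >= 0.11);
# num_ops/magnitude are placeholder values, not taken from the paper.
strong_augment = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.RandomCrop(32, padding=4, padding_mode="reflect"),
    transforms.RandAugment(num_ops=2, magnitude=9),
    transforms.ToTensor(),
])
```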
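
Finally, the Experiment Setup row maps directly onto a PyTorch configuration. The sketch below sets up Nesterov SGD with momentum 0.9, weight decay 5e-4 applied to everything except biases and batch-normalization parameters, and the cosine schedule 0.03 cos(7πt/16T) with T = 2^20. The stand-in model and the parameter-splitting heuristic are assumptions for illustration, not the authors' code.

```python
import math
import torch
from torch import nn

T = 2 ** 20       # total training iterations
base_lr = 0.03
model = nn.Sequential(nn.Conv2d(3, 16, 3), nn.BatchNorm2d(16))  # stand-in for WideResNet-28-2

# Exclude biases and batch-norm parameters (all 1-D tensors) from weight decay.
decay, no_decay = [], []
for name, p in model.named_parameters():
    (no_decay if p.ndim == 1 or name.endswith(".bias") else decay).append(p)

optimizer = torch.optim.SGD(
    [{"params": decay, "weight_decay": 5e-4},
     {"params": no_decay, "weight_decay": 0.0}],
    lr=base_lr, momentum=0.9, nesterov=True)

# Cosine schedule from the paper: lr(t) = 0.03 * cos(7πt / (16T)).
# Call scheduler.step() once per training iteration.
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer, lr_lambda=lambda t: math.cos(7 * math.pi * t / (16 * T)))

# Per the paper, each step would draw 64 labeled and 448 unlabeled examples,
# with the unlabeled loss weighted by λ = 1 (data loading omitted here).
```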