Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Sinkhorn Label Allocation: Semi-Supervised Classification via Annealed Self-Training
Authors: Kai Sheng Tai, Peter D Bailis, Gregory Valiant
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the effectiveness of our algorithm on the CIFAR-10, CIFAR-100, and SVHN datasets in comparison with FixMatch, a state-of-the-art self-training algorithm. Our main baseline for comparison is the FixMatch algorithm (Sohn et al., 2020) since it is a state-of-the-art method for semi-supervised image classification. For each configuration, we report the mean and standard deviation of the error rate across 5 independent trials. |
| Researcher Affiliation | Academia | ¹Stanford University, Stanford, CA, USA. Correspondence to: Kai Sheng Tai <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 Sinkhorn Label Allocation (SLA) and Algorithm 2 Self-training with Sinkhorn Label Allocation and consistency regularization |
| Open Source Code | Yes | Our code is available at https://github.com/stanford-futuredata/sinkhorn-label-allocation. |
| Open Datasets | Yes | We used the CIFAR-10, CIFAR-100 (Krizhevsky, 2009), and SVHN (Netzer et al., 2011) image classification datasets with their standard train/test splits. |
| Dataset Splits | No | The paper mentions 'standard train/test splits' for CIFAR-10, CIFAR-100, and SVHN datasets, and details how labeled and unlabeled examples are sampled from the training split. However, it does not explicitly provide information about a separate validation dataset split. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types, or memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper describes various methods and architectures (e.g., 'RandAugment', 'Wide ResNet-28-2 architecture'), but it does not specify version numbers for any software dependencies like programming languages, libraries, or frameworks (e.g., 'Python 3.x', 'PyTorch 1.x'). |
| Experiment Setup | Yes | We optimized our classifiers using the stochastic Nesterov accelerated gradient method with a momentum parameter of 0.9 and a cosine learning rate schedule given by 0.03 cos(7πt/(16T)), where t is the current iteration and T = 2²⁰ is the total number of iterations. We used a labeled batch size of 64, an unlabeled batch size of 448, weight decay of 5 × 10⁻⁴ on all parameters except biases and batch normalization weights, and unlabeled loss weight λ = 1. For hyperparameters specific to SLA, we used a Sinkhorn regularization parameter of γ = 100 and tolerance parameter ϵₜ = 0.01‖cₜ‖₁ for Sinkhorn iteration, where cₜ is the target column sum at iteration t. |
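
The cosine learning rate schedule quoted in the Experiment Setup row can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: the names `BASE_LR`, `TOTAL_ITERS`, `HYPERPARAMS`, and `cosine_lr` are hypothetical, and the total iteration count is taken to be T = 2^20 as an assumption. The remaining values are copied from the quoted setup.

```python
import math

# Values quoted in the Experiment Setup row; the dictionary keys are
# illustrative names, not identifiers from the authors' repository.
BASE_LR = 0.03
TOTAL_ITERS = 2 ** 20  # assumption: T = 2^20 total iterations

HYPERPARAMS = {
    "momentum": 0.9,              # Nesterov momentum
    "labeled_batch_size": 64,
    "unlabeled_batch_size": 448,
    "weight_decay": 5e-4,         # not applied to biases or batch-norm weights
    "unlabeled_loss_weight": 1.0,  # lambda
    "sinkhorn_gamma": 100.0,      # Sinkhorn regularization parameter
}


def cosine_lr(t, total_iters=TOTAL_ITERS, base_lr=BASE_LR):
    """Learning rate at iteration t: base_lr * cos(7 * pi * t / (16 * T))."""
    if not 0 <= t <= total_iters:
        raise ValueError("t must lie in [0, total_iters]")
    return base_lr * math.cos(7 * math.pi * t / (16 * total_iters))


if __name__ == "__main__":
    # Print the rate at the start, midpoint, and end of training.
    for step in (0, TOTAL_ITERS // 2, TOTAL_ITERS):
        print(step, cosine_lr(step))
```

Under these assumptions the rate starts at 0.03 and decays monotonically to 0.03 · cos(7π/16) ≈ 0.0059 at the final iteration, so it never reaches zero. In a PyTorch training loop, a function of this shape could be wrapped in a per-iteration scheduler, but the paper does not state which framework or version was used.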