FixMatch: Simplifying Semi-Supervised Learning with Consistency and Confidence

Authors: Kihyuk Sohn, David Berthelot, Nicholas Carlini, Zizhao Zhang, Han Zhang, Colin A. Raffel, Ekin Dogus Cubuk, Alexey Kurakin, Chun-Liang Li

NeurIPS 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Despite its simplicity, we show that FixMatch achieves state-of-the-art performance across a variety of standard semi-supervised learning benchmarks, including 94.93% accuracy on CIFAR-10 with 250 labels and 88.61% accuracy with 40 labels (just 4 labels per class). We carry out an extensive ablation study to tease apart the experimental factors that are most important to FixMatch's success.
Researcher Affiliation | Industry | Kihyuk Sohn, David Berthelot, Chun-Liang Li, Zizhao Zhang, Nicholas Carlini, Ekin D. Cubuk, Alex Kurakin, Han Zhang, Colin Raffel (Google Research) {kihyuks,dberth,chunliang,zizhaoz,ncarlini,cubuk,kurakin,zhanghan,craffel}@google.com
Pseudocode | Yes | We present a complete algorithm for FixMatch in Algorithm 1 of the supplementary material. (A sketch of this step appears after the table.)
Open Source Code | Yes | The code is available at https://github.com/google-research/fixmatch.
Open Datasets | Yes | We evaluate the efficacy of FixMatch on several SSL image classification benchmarks. Specifically, we perform experiments with varying amounts of labeled data and augmentation strategies on CIFAR-10/100 [23], SVHN [35], STL-10 [9], and ImageNet [13], following standard SSL evaluation protocols [36, 4, 3]. (A loading sketch appears after the table.)
Dataset Splits | Yes | Following [54], we use 10% of the training data as labeled and treat the rest as unlabeled examples. We compute the mean and variance of accuracy when training on 5 different folds of labeled data. (A fold-splitting sketch appears after the table.)
Hardware Specification | No | The paper mentions network architectures (e.g., 'Wide ResNet-28-2', 'ResNet-50') used for experiments but does not provide specific details about the hardware setup, such as GPU/CPU models, memory, or specific cloud computing instance types used for training or evaluation.
Software Dependencies | No | The paper mentions 'standard SGD with momentum', the 'Adam optimizer', and several augmentation methods (RandAugment, CTAugment, Cutout), but it does not provide version numbers for any software libraries, frameworks (e.g., TensorFlow, PyTorch), or programming languages used for implementation. (An optimizer sketch appears after the table.)
Experiment Setup | Yes | We use an identical set of hyperparameters (λu = 1, η = 0.03, β = 0.9, τ = 0.95, µ = 7, B = 64, K = 2^20) across all amounts of labeled examples and datasets other than ImageNet. A complete list of hyperparameters is reported in Appendix B.1. For the learning rate schedule, we use a cosine learning rate decay [28], which sets the learning rate to η cos(7πk / 16K), where η is the initial learning rate, k is the current training step, and K is the total number of training steps. (A schedule sketch appears after the table.)
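
As a companion to the Pseudocode row, here is a minimal sketch of the confidence-thresholded consistency step that Algorithm 1 formalizes. It assumes PyTorch; `weak_augment` and `strong_augment` are hypothetical stand-ins for the paper's flip-and-shift and RandAugment/CTAugment pipelines, and the paper's Algorithm 1 remains the authoritative statement.

```python
import torch
import torch.nn.functional as F

def fixmatch_unlabeled_loss(model, u_batch, weak_augment, strong_augment,
                            tau=0.95):
    """One FixMatch step on an unlabeled batch (sketch, not the official code)."""
    # Pseudo-label from the weakly augmented view; no gradient flows here.
    with torch.no_grad():
        probs = F.softmax(model(weak_augment(u_batch)), dim=-1)
        max_probs, pseudo_labels = probs.max(dim=-1)
        # Keep only predictions above the confidence threshold tau = 0.95.
        mask = (max_probs >= tau).float()

    # Cross-entropy on the strongly augmented view against the pseudo-labels.
    logits_strong = model(strong_augment(u_batch))
    per_example = F.cross_entropy(logits_strong, pseudo_labels,
                                  reduction="none")
    return (mask * per_example).mean()
```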
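
For the Open Datasets row: all four benchmarks are downloadable through standard libraries. A hedged loading sketch, assuming torchvision (the paper itself does not name a framework):

```python
# Assumes torchvision; the root directory is a hypothetical local cache.
from torchvision import datasets

root = "./data"
cifar10 = datasets.CIFAR10(root, train=True, download=True)
cifar100 = datasets.CIFAR100(root, train=True, download=True)
svhn = datasets.SVHN(root, split="train", download=True)
stl10 = datasets.STL10(root, split="train", download=True)
```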
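
For the Dataset Splits row, a sketch of drawing the five labeled/unlabeled folds described above. The helper name is ours, and the per-class balancing that SSL protocols typically apply is omitted for brevity:

```python
import numpy as np

def labeled_folds(num_examples, frac_labeled=0.10, num_folds=5, seed=0):
    """Draw independent labeled/unlabeled index splits (sketch)."""
    rng = np.random.default_rng(seed)
    n_labeled = int(round(frac_labeled * num_examples))
    folds = []
    for _ in range(num_folds):
        # Each fold is a fresh random draw; the remainder is unlabeled.
        perm = rng.permutation(num_examples)
        folds.append((perm[:n_labeled], perm[n_labeled:]))
    return folds
```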
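
For the Software Dependencies row: the optimizers are named but no framework or version is. One plausible instantiation, assuming PyTorch and plugging in the η = 0.03 and β = 0.9 quoted in the Experiment Setup row; the model here is only a placeholder:

```python
import torch
from torch import nn

# Placeholder network standing in for Wide ResNet-28-2 (illustration only).
model = nn.Linear(3 * 32 * 32, 10)

# SGD with momentum, matching the quoted eta = 0.03 and beta = 0.9.
optimizer = torch.optim.SGD(model.parameters(), lr=0.03, momentum=0.9)
```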
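
Finally, the cosine schedule from the Experiment Setup row is a one-liner; this sketch plugs in the quoted η = 0.03 and K = 2^20:

```python
import math

def fixmatch_lr(step, base_lr=0.03, total_steps=2**20):
    """Cosine decay eta * cos(7*pi*k / (16*K)) from the paper (sketch)."""
    # At k = K the factor is cos(7*pi/16) ~ 0.195, so the learning rate
    # decays to roughly a fifth of its initial value rather than to zero.
    return base_lr * math.cos(7 * math.pi * step / (16 * total_steps))
```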