FixMatch: Simplifying Semi-Supervised Learning with Consistency and Confidence

Authors: Kihyuk Sohn, David Berthelot, Nicholas Carlini, Zizhao Zhang, Han Zhang, Colin A. Raffel, Ekin Dogus Cubuk, Alexey Kurakin, Chun-Liang Li

NeurIPS 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Despite its simplicity, we show that FixMatch achieves state-of-the-art performance across a variety of standard semi-supervised learning benchmarks, including 94.93% accuracy on CIFAR-10 with 250 labels and 88.61% accuracy with 40 labels (just 4 labels per class). We carry out an extensive ablation study to tease apart the experimental factors that are most important to FixMatch's success.
Researcher Affiliation | Industry | Kihyuk Sohn, David Berthelot, Chun-Liang Li, Zizhao Zhang, Nicholas Carlini, Ekin D. Cubuk, Alex Kurakin, Han Zhang, Colin Raffel (Google Research) {kihyuks,dberth,chunliang,zizhaoz,ncarlini,cubuk,kurakin,zhanghan,craffel}@google.com
Pseudocode | Yes | We present a complete algorithm for FixMatch in Algorithm 1 of the supplementary material. (A sketch of this step appears after the table.)
Open Source Code | Yes | The code is available at https://github.com/google-research/fixmatch.
Open Datasets | Yes | We evaluate the efficacy of FixMatch on several SSL image classification benchmarks. Specifically, we perform experiments with varying amounts of labeled data and augmentation strategies on CIFAR-10/100 [23], SVHN [35], STL-10 [9], and ImageNet [13], following standard SSL evaluation protocols [36, 4, 3]. (A loading sketch appears after the table.)
Dataset Splits | Yes | Following [54], we use 10% of the training data as labeled and treat the rest as unlabeled examples. We compute the mean and variance of accuracy when training on 5 different folds of labeled data. (A fold-splitting sketch appears after the table.)
Hardware Specification | No | The paper mentions network architectures (e.g., 'Wide ResNet-28-2', 'ResNet-50') used for experiments but does not provide specific details about the hardware setup, such as GPU/CPU models, memory, or specific cloud computing instance types used for training or evaluation.
Software Dependencies | No | The paper mentions 'standard SGD with momentum', the 'Adam optimizer', and several augmentation methods (RandAugment, CTAugment, Cutout), but it does not provide version numbers for any software libraries, frameworks (e.g., TensorFlow, PyTorch), or programming languages used for implementation. (An optimizer sketch appears after the table.)
Experiment Setup | Yes | We use an identical set of hyperparameters (λu = 1, η = 0.03, β = 0.9, τ = 0.95, µ = 7, B = 64, K = 2^20) across all amounts of labeled examples and datasets other than ImageNet. A complete list of hyperparameters is reported in Appendix B.1. For the learning rate schedule, we use a cosine learning rate decay [28], which sets the learning rate to η cos(7πk / 16K), where η is the initial learning rate, k is the current training step, and K is the total number of training steps. (A schedule sketch appears after the table.)
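
As a companion to the Pseudocode row, here is a minimal sketch of the confidence-thresholded consistency step that Algorithm 1 formalizes. It assumes PyTorch; `weak_augment` and `strong_augment` are hypothetical stand-ins for the paper's flip-and-shift and RandAugment/CTAugment pipelines, and the paper's Algorithm 1 remains the authoritative statement.

```python
import torch
import torch.nn.functional as F

def fixmatch_unlabeled_loss(model, u_batch, weak_augment, strong_augment,
                            tau=0.95):
    """One FixMatch step on an unlabeled batch (sketch, not the official code)."""
    # Pseudo-label from the weakly augmented view; no gradient flows here.
    with torch.no_grad():
        probs = F.softmax(model(weak_augment(u_batch)), dim=-1)
        max_probs, pseudo_labels = probs.max(dim=-1)
        # Keep only predictions above the confidence threshold tau = 0.95.
        mask = (max_probs >= tau).float()

    # Cross-entropy on the strongly augmented view against the pseudo-labels.
    logits_strong = model(strong_augment(u_batch))
    per_example = F.cross_entropy(logits_strong, pseudo_labels,
                                  reduction="none")
    return (mask * per_example).mean()
```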
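
For the Open Datasets row: all four benchmarks are downloadable through standard libraries. A hedged loading sketch, assuming torchvision (the paper itself does not name a framework):

```python
# Assumes torchvision; the root directory is a hypothetical local cache.
from torchvision import datasets

root = "./data"
cifar10 = datasets.CIFAR10(root, train=True, download=True)
cifar100 = datasets.CIFAR100(root, train=True, download=True)
svhn = datasets.SVHN(root, split="train", download=True)
stl10 = datasets.STL10(root, split="train", download=True)
```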
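
For the Dataset Splits row, a sketch of drawing the five labeled/unlabeled folds described above. The helper name is ours, and the per-class balancing that SSL protocols typically apply is omitted for brevity:

```python
import numpy as np

def labeled_folds(num_examples, frac_labeled=0.10, num_folds=5, seed=0):
    """Draw independent labeled/unlabeled index splits (sketch)."""
    rng = np.random.default_rng(seed)
    n_labeled = int(round(frac_labeled * num_examples))
    folds = []
    for _ in range(num_folds):
        # Each fold is a fresh random draw; the remainder is unlabeled.
        perm = rng.permutation(num_examples)
        folds.append((perm[:n_labeled], perm[n_labeled:]))
    return folds
```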
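
For the Software Dependencies row: the optimizers are named but no framework or version is. One plausible instantiation, assuming PyTorch and plugging in the η = 0.03 and β = 0.9 quoted in the Experiment Setup row; the model here is only a placeholder:

```python
import torch
from torch import nn

# Placeholder network standing in for Wide ResNet-28-2 (illustration only).
model = nn.Linear(3 * 32 * 32, 10)

# SGD with momentum, matching the quoted eta = 0.03 and beta = 0.9.
optimizer = torch.optim.SGD(model.parameters(), lr=0.03, momentum=0.9)
```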
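
Finally, the cosine schedule from the Experiment Setup row is a one-liner; this sketch plugs in the quoted η = 0.03 and K = 2^20:

```python
import math

def fixmatch_lr(step, base_lr=0.03, total_steps=2**20):
    """Cosine decay eta * cos(7*pi*k / (16*K)) from the paper (sketch)."""
    # At k = K the factor is cos(7*pi/16) ~ 0.195, so the learning rate
    # decays to roughly a fifth of its initial value rather than to zero.
    return base_lr * math.cos(7 * math.pi * step / (16 * total_steps))
```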