ReMixMatch: Semi-Supervised Learning with Distribution Matching and Augmentation Anchoring
Authors: David Berthelot, Nicholas Carlini, Ekin D. Cubuk, Alex Kurakin, Kihyuk Sohn, Han Zhang, Colin Raffel
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We improve the recently-proposed MixMatch semi-supervised learning algorithm by introducing two new techniques: distribution alignment and augmentation anchoring. ... ReMixMatch, is significantly more data-efficient than prior work, requiring between 5× and 16× less data to reach the same accuracy. For example, on CIFAR-10 with 250 labeled examples we reach 93.73% accuracy ... We call our improved algorithm ReMixMatch and experimentally validate it on a suite of standard SSL image benchmarks. ReMixMatch achieves state-of-the-art accuracy across all labeled data amounts... (Distribution alignment is sketched in code below the table.) |
| Researcher Affiliation | Industry | David Berthelot, Nicholas Carlini, Ekin D. Cubuk, Alex Kurakin, Han Zhang, Colin Raffel Google Research {dberth,ncarlini,cubuk,kurakin,zhanghan,craffel}@google.com Kihyuk Sohn Google Cloud AI kihyuks@google.com |
| Pseudocode | Yes | Algorithm 1 ReMixMatch algorithm for producing a collection of processed labeled examples and processed unlabeled examples with label guesses (cf. Berthelot et al. (2019) Algorithm 1.) |
| Open Source Code | Yes | We make our code and data open-source at https://github.com/google-research/remixmatch. |
| Open Datasets | Yes | For example, on CIFAR-10 with 250 labeled examples we reach 93.73% accuracy... We experimentally validate it on a suite of standard SSL image benchmarks. ... CIFAR-10 Our results on CIFAR-10 are shown in table 1, left. SVHN Results for SVHN are shown in table 1, right. The STL-10 dataset consists of 5,000 labeled 96×96 color images drawn from 10 classes and 100,000 unlabeled images... |
| Dataset Splits | Yes | We follow the Realistic Semi-Supervised Learning (Oliver et al., 2018) recommendations for performing SSL evaluations. ... The STL-10 dataset consists of 5,000 labeled... The labeled set is partitioned into ten pre-defined folds of 1,000 images each. For efficiency, we only run our analysis on five of these ten folds. ... We sort the table by error rate over five different splits (i.e., 40-label subsets) of the training data. |
| Hardware Specification | No | No specific hardware details (like GPU models, CPU types, or cloud instance names) are mentioned in the paper. It only mentions using a 'Wide ResNet-28-2'. |
| Software Dependencies | No | The paper mentions using 'Adam (Kingma & Ba, 2015)' as an optimizer, but does not provide specific version numbers for any software dependencies, libraries, or programming languages. |
| Experiment Setup | Yes | ReMixMatch introduces two new hyperparameters: the weight on the rotation loss λ_r and the weight on the un-augmented example λ_Û₁. In practice both are fixed: λ_r = λ_Û₁ = 0.5. ReMixMatch also shares many hyperparameters with MixMatch: the weight for the unlabeled loss λ_U, the sharpening temperature T, the MixUp Beta parameter, and the number of augmentations K. All experiments (unless otherwise stated) use T = 0.5, Beta = 0.75, and λ_U = 1.5. We found using a larger number of augmentations monotonically increases accuracy, and so set K = 8 for all experiments (as running with K augmentations increases computation by a factor of K). We train our models using Adam (Kingma & Ba, 2015) with a fixed learning rate of 0.002 and weight decay (Zhang et al., 2018) with a fixed value of 0.02. We take the final model as an exponential moving average over the trained model weights with a decay of 0.999. (A minimal sketch of this optimizer/EMA setup follows the table.) |
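The abstract quoted in the Research Type row names distribution alignment as one of ReMixMatch's two new techniques: each guessed label distribution on unlabeled data is scaled toward the labeled marginal class distribution, renormalized, and then sharpened. Below is a minimal NumPy sketch under our reading of the paper; the function and variable names (`align_and_sharpen`, `running_pred_avg`, `labeled_marginal`) are ours, not the authors'.

```python
import numpy as np

def align_and_sharpen(q, running_pred_avg, labeled_marginal, T=0.5):
    """Distribution alignment followed by temperature sharpening (names are ours)."""
    q = q * (labeled_marginal / running_pred_avg)  # scale toward the labeled class marginal
    q = q / q.sum()                                # renormalize to a valid distribution
    q = q ** (1.0 / T)                             # sharpen; the paper fixes T = 0.5
    return q / q.sum()

# Toy usage: a 10-class guess, uniform labeled marginal, mildly skewed running average.
q = np.full(10, 0.1); q[3] = 0.19; q /= q.sum()
running = np.full(10, 0.1); running[3] = 0.15; running /= running.sum()
print(align_and_sharpen(q, running, labeled_marginal=np.full(10, 0.1)))
```

In the paper, the running average of model predictions is maintained over recent unlabeled batches; here it is simply passed in as an argument.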
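The Experiment Setup row quotes Adam with a fixed learning rate of 0.002, weight decay of 0.02, and a weight EMA with decay 0.999. Here is a minimal PyTorch sketch of that setup; the paper's released code is TensorFlow, so reading its weight decay (Zhang et al., 2018) as AdamW-style decoupled decay is our assumption, as is the stand-in network.

```python
import copy
import torch
import torch.nn as nn

# Stand-in network (assumption); the paper trains a Wide ResNet-28-2.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))

# Adam with lr = 0.002 and weight_decay = 0.02 as quoted; AdamW implements
# decoupled weight decay, which is one reading of the Zhang et al. (2018) citation.
optimizer = torch.optim.AdamW(model.parameters(), lr=0.002, weight_decay=0.02)

# Exponential moving average of the trained weights with decay 0.999; per the
# paper, the EMA copy is taken as the final model.
ema_model = copy.deepcopy(model)

@torch.no_grad()
def update_ema(model, ema_model, decay=0.999):
    for p, ema_p in zip(model.parameters(), ema_model.parameters()):
        ema_p.mul_(decay).add_(p, alpha=1.0 - decay)
```

Calling `update_ema(model, ema_model)` after each optimizer step keeps the evaluation copy trailing the trained weights.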