Realistic Evaluation of Deep Semi-Supervised Learning Algorithms
Authors: Avital Oliver, Augustus Odena, Colin A. Raffel, Ekin Dogus Cubuk, Ian Goodfellow
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | After creating a unified reimplementation of various widely-used SSL techniques, we test them in a suite of experiments designed to address these issues. |
| Researcher Affiliation | Industry | Google Brain ({avitalo,augustusodena,craffel,cubuk,goodfellow}@google.com) |
| Pseudocode | No | The paper describes algorithms but does not provide structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | To help guide SSL research towards real-world applicability, we make our unified reimplementation and evaluation platform publicly available. (Footnote 2: https://github.com/brain-research/realistic-ssl-evaluation) |
| Open Datasets | Yes | We tested each SSL approach on the widely-reported image classification benchmarks of SVHN [40] with all but 1,000 labels discarded and CIFAR-10 [31] with all but 4,000 labels discarded. (See the label-split sketch below the table.) |
| Dataset Splits | Yes | We optimized hyperparameters to minimize classification error on the standard validation set from each dataset, as is common practice (an approach we evaluate critically in section 4.6). |
| Hardware Specification | No | For every SSL technique in addition to a fully-supervised (not utilizing unlabeled data) baseline, we ran 1000 trials of Gaussian Process-based black box optimization using Google Cloud ML Engine's hyperparameter tuning service [18]. (See the tuning sketch below the table.) |
| Software Dependencies | No | The paper mentions software components like 'Adam optimizer' and 'Wide ResNet' but does not specify their version numbers or the versions of underlying libraries or programming languages. |
| Experiment Setup | Yes | We chose a Wide ResNet [52] due to their widespread adoption and availability. Specifically, we used WRN-28-2... For training, we chose the ubiquitous Adam optimizer [29]. For all datasets, we followed standard procedures for regularization, data augmentation, and preprocessing; details are in appendix B. ... An enumeration of these hyperparameter settings can be found in appendix C. (See the WRN-28-2 sketch below the table.) |
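For the Open Datasets row: the paper evaluates on SVHN with all but 1,000 labels discarded and CIFAR-10 with all but 4,000 labels discarded. The exact discard procedure is not spelled out in the response above; the NumPy sketch below assumes a class-balanced random draw, and `make_ssl_split` is a hypothetical helper name, not a function from the authors' codebase.

```python
import numpy as np

def make_ssl_split(labels, n_labeled, n_classes=10, seed=0):
    """Keep a class-balanced set of n_labeled labels; treat the rest as unlabeled."""
    rng = np.random.RandomState(seed)
    labeled_idx = []
    for c in range(n_classes):
        class_idx = np.where(labels == c)[0]
        # Assumption: equal labels per class; the paper may use a different scheme.
        labeled_idx.extend(rng.choice(class_idx, n_labeled // n_classes, replace=False))
    labeled_idx = np.array(sorted(labeled_idx))
    unlabeled_idx = np.setdiff1d(np.arange(len(labels)), labeled_idx)
    return labeled_idx, unlabeled_idx

# Synthetic stand-in for CIFAR-10 train labels (50,000 examples, 10 classes):
labels = np.random.RandomState(1).randint(0, 10, size=50000)
labeled, unlabeled = make_ssl_split(labels, n_labeled=4000)
print(len(labeled), len(unlabeled))  # 4000 46000
```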
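For the Hardware Specification row: hyperparameters were tuned with 1,000 trials of Gaussian Process-based black-box optimization via Google Cloud ML Engine's proprietary service, which cannot be reproduced verbatim here. As a rough stand-in, the sketch below uses scikit-optimize's `gp_minimize`; the search space, trial budget, and synthetic objective are illustrative assumptions, not the paper's actual settings (those are enumerated in its appendix C).

```python
import math
from skopt import gp_minimize
from skopt.space import Real

# Hypothetical search space: the paper tunes per-algorithm hyperparameters
# (learning rate, consistency coefficient, etc.); these ranges are guesses.
space = [
    Real(1e-5, 1e-2, prior="log-uniform", name="learning_rate"),
    Real(1e-2, 1e2, prior="log-uniform", name="consistency_weight"),
]

def validation_error(params):
    lr, consistency_weight = params
    # Placeholder objective: a real trial would train the SSL model and
    # return validation error. This synthetic bowl keeps the sketch runnable.
    return (math.log10(lr) + 3.5) ** 2 + math.log10(consistency_weight) ** 2

# The paper ran 1,000 trials per technique; 50 calls here just demonstrates
# the Gaussian Process optimization loop.
result = gp_minimize(validation_error, space, n_calls=50, random_state=0)
print("best params:", result.x, "best error:", result.fun)
```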
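For the Experiment Setup row: the shared backbone is a WRN-28-2 trained with Adam. The authors' reference implementation lives in the linked repository; the sketch below is an independent, minimal PyTorch rendering of the standard Wide ResNet recipe (depth 28 = 6n + 4 with n = 4 blocks per group, widening factor 2), with the regularization, augmentation, and tuned hyperparameters from the paper's appendices B and C omitted.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BasicBlock(nn.Module):
    """Pre-activation residual block used in Wide ResNets."""
    def __init__(self, in_planes, out_planes, stride):
        super().__init__()
        self.bn1 = nn.BatchNorm2d(in_planes)
        self.conv1 = nn.Conv2d(in_planes, out_planes, 3, stride, 1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_planes)
        self.conv2 = nn.Conv2d(out_planes, out_planes, 3, 1, 1, bias=False)
        # 1x1 projection shortcut when shape changes.
        self.shortcut = None
        if stride != 1 or in_planes != out_planes:
            self.shortcut = nn.Conv2d(in_planes, out_planes, 1, stride, bias=False)

    def forward(self, x):
        out = F.relu(self.bn1(x))
        shortcut = self.shortcut(out) if self.shortcut is not None else x
        out = self.conv1(out)
        out = self.conv2(F.relu(self.bn2(out)))
        return out + shortcut

class WideResNet(nn.Module):
    """WRN-d-k: depth d = 6n + 4, widening factor k."""
    def __init__(self, depth=28, widen=2, num_classes=10):
        super().__init__()
        n = (depth - 4) // 6
        widths = [16, 16 * widen, 32 * widen, 64 * widen]
        self.conv1 = nn.Conv2d(3, widths[0], 3, 1, 1, bias=False)
        layers, in_planes = [], widths[0]
        for i, w in enumerate(widths[1:]):
            stride = 1 if i == 0 else 2  # downsample at groups 2 and 3
            for j in range(n):
                layers.append(BasicBlock(in_planes, w, stride if j == 0 else 1))
                in_planes = w
        self.blocks = nn.Sequential(*layers)
        self.bn = nn.BatchNorm2d(in_planes)
        self.fc = nn.Linear(in_planes, num_classes)

    def forward(self, x):
        out = self.blocks(self.conv1(x))
        out = F.relu(self.bn(out))
        out = F.adaptive_avg_pool2d(out, 1).flatten(1)
        return self.fc(out)

model = WideResNet(depth=28, widen=2, num_classes=10)
# The learning rate below is a placeholder, not the paper's tuned value.
optimizer = torch.optim.Adam(model.parameters(), lr=3e-4)
logits = model(torch.randn(2, 3, 32, 32))  # sanity check: shape (2, 10)
```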