Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Barely-Supervised Learning: Semi-supervised Learning with Very Few Labeled Images
Authors: Thomas Lucas, Philippe Weinzaepfel, Gregory Rogez (pp. 1881-1889)
AAAI 2022 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments show that our approach performs significantly better on STL-10 in the barely-supervised regime, e.g. with 4 or 8 labeled images per class. ... Summary of our main contributions: An analysis of the distillation dilemma in FixMatch. ... Experiments showing that our approach allows barely-supervised learning on the more realistic STL-10 dataset. |
| Researcher Affiliation | Industry | Thomas Lucas¹, Philippe Weinzaepfel¹, Gregory Rogez¹; ¹Naver Labs Europe* |
| Pseudocode | No | The paper includes a high-level overview diagram in Figure 1, but no detailed pseudocode or algorithm blocks are present. |
| Open Source Code | No | The paper does not contain any explicit statements or links indicating that source code for the described methodology is publicly available. |
| Open Datasets | Yes | We perform most ablations on STL-10 and also compare approaches on CIFAR-10 and CIFAR-100. ... The STL-10 dataset consists of 5k labeled images of resolution 96 × 96 split into 10 classes, and 100k unlabeled images. |
| Dataset Splits | Yes | We use various amounts of labeled data: 10 (1 image per class), 20, 40, 80, 250, 1000. ... We average across 4 random seeds for 4 images per class or less, 3 otherwise, and across the last 10 checkpoints of all runs. |
| Hardware Specification | No | The paper mentions the use of Wide-ResNet architectures (WR-28-2, WR-28-8, WR-37-2) but does not specify any hardware details like GPU models, CPU types, or other computing resources used for experiments. |
| Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies (e.g., programming languages, libraries, frameworks) used in the experiments. |
| Experiment Setup | Yes | We use τ = 0.95 for FixMatch and τ = 0.98 for our model, see Section 5.3 for discussions about setting τ. ... Standard deviations increase as the number of labels decreases, so we average across 4 random seeds for 4 images per class or less, 3 otherwise, and across the last 10 checkpoints of all runs. |
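
The averaging protocol quoted in the Dataset Splits and Experiment Setup rows can be illustrated with a minimal sketch. This is not the authors' code (the table notes that none was released). The function names `num_seeds` and `averaged_accuracy` are hypothetical, and pooling all checkpoints before taking the mean is an assumption about how "across the last 10 checkpoints of all runs" is computed.

```python
from statistics import mean
from typing import Sequence


def num_seeds(images_per_class: int) -> int:
    """Number of random seeds per setting, following the quoted rule:
    4 seeds for 4 images per class or fewer, 3 otherwise."""
    return 4 if images_per_class <= 4 else 3


def averaged_accuracy(runs: Sequence[Sequence[float]], last_k: int = 10) -> float:
    """Mean accuracy over the last `last_k` checkpoints of every run.

    `runs` holds one sequence of per-checkpoint accuracies per seed.
    Assumption: checkpoints from all runs are pooled before averaging.
    """
    pooled = [acc for run in runs for acc in run[-last_k:]]
    if not pooled:
        raise ValueError("no checkpoint accuracies provided")
    return mean(pooled)
```

When every run contributes the same number of checkpoints, pooling gives the same value as averaging the per-run means, so the choice only matters if some runs have fewer than `last_k` checkpoints.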