On Data-Augmentation and Consistency-Based Semi-Supervised Learning

Authors: Atin Ghosh, Alexandre H. Thiery

ICLR 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this text, we analyse (variations of) the Π-model in settings where analytically tractable results can be obtained. We establish links with Manifold Tangent Classifiers and demonstrate that the quality of the perturbations is key to obtaining reasonable SSL performances. Importantly, we propose a simple extension of the Hidden Manifold Model that naturally incorporates data-augmentation schemes and offers a framework for understanding and experimenting with SSL methods. ... Our numerical experiments suggest that, in the standard setting when the number of labelled samples is much lower than the number of unlabelled samples, i.e. |DL| ≪ |DU|, the formulation equation 1 of the consistency regularization leads to sub-optimal results and convergence issues: the information contained in the labelled data is swamped by the number of unlabelled samples. ... Figure 3 (Left) shows that this method is relatively insensitive to the parameter λ, as long as it is within reasonable bounds. ... Figure 3 (Right) reports the generalization properties of the method for different amounts of data-augmentation. (A minimal code sketch of this consistency loss appears after the table.)
Researcher Affiliation | Academia | Atin Ghosh & Alexandre H. Thiery, Department of Statistics and Applied Probability, National University of Singapore; atin.ghosh@u.nus.edu, a.h.thiery@nus.edu.sg
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any explicit statements or links indicating the availability of open-source code for the described methodology.
Open Datasets | No | The paper introduces a 'Generative Model for Semi-Supervised Learning' (Section 4.3) and explicitly states that it works 'without relying on a specific dataset.' The data are generated from a Hidden Manifold Model rather than drawn from a pre-existing public dataset for which access information could be provided.
Dataset Splits | No | The paper mentions '10 labelled data pairs' and '1000 unlabelled data samples' for the generative model, but it does not specify explicit train/validation/test splits, percentages, or a methodology for partitioning the data. Cross-validation and predefined splits are not mentioned. (A sketch of this generative setup appears after the table.)
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU models, CPU types) used for running the experiments.
Software Dependencies | No | The paper does not provide specific version numbers for any software components, libraries, or frameworks used (e.g., Python, PyTorch, TensorFlow).
Experiment Setup | Yes | In all our experiments, we use a standard Stochastic Gradient Descent (SGD) method with constant learning rate and momentum β = 0.9. ... In all the experiments, we used λ = 10 and used SGD with momentum β = 0.9. ... We use ε = 0.3 in all the experiments. ... For βMT ∈ {0.9, 0.95, 0.99, 0.995}, the final test NLL obtained through the MT approach is identical to the test NLL obtained through the Π-model. (A training-loop sketch reflecting these settings appears after the table.)
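
The Research Type excerpt describes the Π-model's consistency regularization (equation 1 of the paper): a supervised loss on the small labelled set plus a penalty forcing predictions on two stochastic augmentations of each unlabelled sample to agree. Since the report finds no released code, the following is only a minimal PyTorch sketch of that generic objective; model, augment and the default weight lam are illustrative placeholders, not the authors' implementation.

    # Minimal sketch of a Pi-model-style objective (illustrative, not the
    # authors' code): supervised cross-entropy on the labelled batch plus a
    # consistency penalty between two stochastic augmentations of the
    # unlabelled batch.
    import torch.nn.functional as F

    def pi_model_loss(model, x_lab, y_lab, x_unlab, augment, lam=10.0):
        # Supervised term on the few labelled samples.
        sup = F.cross_entropy(model(augment(x_lab)), y_lab)

        # Consistency term: softmax outputs under two independent
        # augmentations of the same unlabelled inputs should agree.
        p1 = F.softmax(model(augment(x_unlab)), dim=1)
        p2 = F.softmax(model(augment(x_unlab)), dim=1)
        cons = F.mse_loss(p1, p2)

        # When |D_L| << |D_U| the second term can swamp the first, which is
        # the imbalance issue the excerpt points to.
        return sup + lam * cons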
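
The Open Datasets and Dataset Splits rows refer to synthetic data drawn from a Hidden Manifold Model rather than a public dataset. The sketch below generates data in that spirit with the quoted sample counts (10 labelled pairs, 1000 unlabelled samples); the latent and ambient dimensions, the tanh feature map, the sign teacher and the way ε = 0.3 enters the latent-space augmentation are assumptions made for illustration, not the paper's exact specification.

    # Hypothetical Hidden-Manifold-style data generator (dimensions,
    # nonlinearity, teacher rule and augmentation are assumed, not taken
    # from the paper).
    import numpy as np

    rng = np.random.default_rng(0)
    d_latent, d_ambient = 10, 100                    # assumed dimensions
    F_map = rng.normal(size=(d_ambient, d_latent))   # fixed feature matrix
    teacher = rng.normal(size=d_latent)              # labels depend on the latent only

    def sample(n):
        z = rng.normal(size=(n, d_latent))            # hidden-manifold coordinates
        x = np.tanh(z @ F_map.T / np.sqrt(d_latent))  # observed inputs
        y = np.sign(z @ teacher / np.sqrt(d_latent))  # teacher labels in {-1, +1}
        return x, y, z

    def augment_latent(z, eps=0.3):
        # Augmentation acting on the hidden manifold: perturb the latent
        # coordinates (eps = 0.3 as quoted) and re-embed.
        return np.tanh((z + eps * rng.normal(size=z.shape)) @ F_map.T / np.sqrt(d_latent))

    x_lab, y_lab, z_lab = sample(10)      # labelled set D_L
    x_unlab, _, z_unlab = sample(1000)    # unlabelled set D_U (labels discarded)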
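
Finally, a skeleton of the quoted optimization setup: plain SGD with a constant learning rate and momentum β = 0.9, and a consistency weight λ = 10, reusing the pi_model_loss sketch above. The learning rate, batch sizes and number of steps are placeholders not reported in the excerpt, and the augment argument is assumed to act on raw inputs; the Mean Teacher variant mentioned in the last row would additionally keep an exponential moving average of the weights with decay βMT, which is omitted here.

    # Hypothetical training loop matching the quoted hyper-parameters
    # (momentum 0.9, lambda = 10); learning rate and step count are guesses.
    from itertools import cycle
    import torch

    def train(model, labelled_loader, unlabelled_loader, augment,
              lr=0.1, momentum=0.9, lam=10.0, steps=1000):
        opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=momentum)
        # Cycle the small labelled loader so it can be revisited many times.
        lab_iter, unlab_iter = cycle(labelled_loader), cycle(unlabelled_loader)
        for _ in range(steps):
            x_lab, y_lab = next(lab_iter)
            x_unlab, _ = next(unlab_iter)    # unlabelled labels are ignored
            loss = pi_model_loss(model, x_lab, y_lab, x_unlab, augment, lam=lam)
            opt.zero_grad()
            loss.backward()
            opt.step()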