On Data-Augmentation and Consistency-Based Semi-Supervised Learning
Authors: Atin Ghosh, Alexandre H. Thiery
ICLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this text, we analyse (variations of) the Π-model in settings where analytically tractable results can be obtained. We establish links with Manifold Tangent Classifiers and demonstrate that the quality of the perturbations is key to obtaining reasonable SSL performances. Importantly, we propose a simple extension of the Hidden Manifold Model that naturally incorporates data-augmentation schemes and offers a framework for understanding and experimenting with SSL methods. ... Our numerical experiments suggest that, in the standard setting when the number of labelled samples is much lower than the number of unlabelled samples, i.e. \|D_L\| ≪ \|D_U\|, the formulation (equation 1) of the consistency regularization leads to sub-optimal results and convergence issues: the information contained in the labelled data is swamped by the number of unlabelled samples. ... Figure 3 (Left) shows that this method is relatively insensitive to the parameter λ, as long as it is within reasonable bounds. ... Figure 3 (Right) reports the generalization properties of the method for different amounts of data-augmentation. (A minimal sketch of this consistency loss appears after the table.) |
| Researcher Affiliation | Academia | Atin Ghosh & Alexandre H. Thiery, Department of Statistics and Applied Probability, National University of Singapore; atin.ghosh@u.nus.edu, a.h.thiery@nus.edu.sg |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any explicit statements or links indicating the availability of open-source code for the described methodology. |
| Open Datasets | No | The paper introduces a 'Generative Model for Semi-Supervised Learning' (Section 4.3) and explicitly states that it works 'without relying on a specific dataset.' The data is generated from a Hidden Manifold Model rather than drawn from a pre-existing public dataset for which access information could be provided. (A sketch of this kind of generative setup appears after the table.) |
| Dataset Splits | No | The paper mentions '10 labelled data pairs' and '1000 unlabelled data samples' for the generative model, but it does not specify explicit train/validation/test splits, percentages, or a methodology for partitioning the data for these purposes. Cross-validation or predefined splits are not mentioned. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU models, CPU types) used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific version numbers for any software components, libraries, or frameworks used (e.g., Python, PyTorch, TensorFlow). |
| Experiment Setup | Yes | In all our experiments, we use a standard Stochastic Gradient Descent (SGD) method with constant learning rate and momentum β = 0.9. ... In all the experiments, we used λ = 10 and used SGD with momentum β = 0.9. ... We use ε = 0.3 in all the experiments. ... For β_MT ∈ {0.9, 0.95, 0.99, 0.995}, the final test NLL obtained through the MT approach is identical to the test NLL obtained through the Π-model. (The optimizer configuration is sketched after the table.) |
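
The consistency regularization quoted in the Research Type row follows the standard Π-model pattern: a supervised loss on the few labelled samples plus a penalty that forces predictions to agree across random augmentations of unlabelled samples, weighted by λ. The sketch below is a minimal illustration of that pattern, not the authors' code; `net`, `augment`, and `lam` are placeholder names, with `lam=10.0` mirroring the λ = 10 reported in the Experiment Setup row.

```python
import torch
import torch.nn.functional as F

def pi_model_loss(net, x_labelled, y_labelled, x_unlabelled, augment, lam=10.0):
    """Generic Pi-model objective (illustrative sketch, not the paper's code):
    supervised cross-entropy on labelled data plus a mean-squared consistency
    penalty between two stochastically augmented views of unlabelled data."""
    # Supervised term on the (few) labelled samples.
    sup = F.cross_entropy(net(x_labelled), y_labelled)
    # Two independent random augmentations of the same unlabelled batch.
    z1 = net(augment(x_unlabelled))
    z2 = net(augment(x_unlabelled))
    # Consistency term: predictions should agree across perturbations.
    cons = F.mse_loss(torch.softmax(z1, dim=1), torch.softmax(z2, dim=1))
    return sup + lam * cons
```

The imbalance the paper highlights is visible in this form of the loss: when \|D_L\| ≪ \|D_U\|, the unlabelled consistency term dominates the gradient signal unless the two terms are balanced carefully.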
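
The Open Datasets row notes that the experiments use synthetic data from a Hidden Manifold Model rather than a public dataset. As a rough illustration of that kind of setup (the dimensions, the tanh nonlinearity, and all variable names below are assumptions, not the paper's exact specification), one can generate inputs that lie on a low-dimensional nonlinear manifold embedded in a high-dimensional ambient space:

```python
import numpy as np

def hidden_manifold_samples(n, dim_latent=10, dim_ambient=500, seed=None):
    """Illustrative Hidden-Manifold-Model-style generator: x = tanh(z @ F / sqrt(d)),
    where z is a low-dimensional latent code and F a fixed random feature map.
    Dimensions and nonlinearity are assumptions, not the paper's exact choices."""
    rng = np.random.default_rng(seed)
    F_map = rng.standard_normal((dim_latent, dim_ambient))  # fixed projection
    z = rng.standard_normal((n, dim_latent))                # latent coordinates
    x = np.tanh(z @ F_map / np.sqrt(dim_latent))            # ambient inputs
    return x, z
```

In such a setup the labels are typically a function of the latent code z alone, so perturbations applied in latent space stay on the data manifold; this is one way to see the paper's point that the quality of the perturbations is key to SSL performance.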
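
The optimizer settings quoted in the Experiment Setup row translate directly into a standard PyTorch configuration. In this sketch the model and the learning-rate value are placeholders, since the quoted text states only that the rate is constant:

```python
import torch
import torch.nn as nn

model = nn.Linear(500, 10)  # stand-in network, for illustration only
# Constant learning rate with momentum beta = 0.9, as quoted above;
# lr=0.1 is a placeholder -- the report does not state the value used.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
```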