How unlabeled data improve generalization in self-training? A one-hidden-layer theoretical analysis
Authors: Shuai Zhang, Meng Wang, Sijia Liu, Pin-Yu Chen, Jinjun Xiong
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments from shallow neural networks to deep neural networks are also provided to justify the correctness of our established theoretical insights on self-training. (Section 4: Empirical Results) |
| Researcher Affiliation | Collaboration | Shuai Zhang, Rensselaer Polytechnic Institute, Troy, NY, USA 12180, zhangs29@rpi.edu; Meng Wang, Rensselaer Polytechnic Institute, Troy, NY, USA 12180, wangm7@rpi.edu; Sijia Liu, Michigan State University, East Lansing, MI, USA 48824 and MIT-IBM Watson AI Lab, IBM Research, liusiji5@msu.edu; Pin-Yu Chen, IBM Research, Yorktown Heights, NY, USA 10562, Pin-Yu.Chen@ibm.com; Jinjun Xiong, University at Buffalo, Buffalo, NY, USA 14260, jinjun@buffalo.edu |
| Pseudocode | Yes | Table 1: Iterative Self-Training ... Algorithm 1: Iterative Self-Training Algorithm (a hedged sketch of this loop is given below the table). |
| Open Source Code | No | The codes are downloaded from https://github.com/yaircarmon/semisup-adv. This refers to a third-party implementation of self-training that the authors used for their experiments, not their own source code for the theoretical analysis or a novel self-training methodology. |
| Open Datasets | Yes | We evaluate self-training on the augmented CIFAR-10 dataset, which has 50K labeled data. The unlabeled data are mined from 80 Million Tiny Images following the setup in (Carmon et al., 2019)4, and additional 50K images are selected for each class, which is a total of 500K images, to form the unlabeled data. |
| Dataset Splits | No | The paper mentions '50K labeled data' and '500K images...to form the unlabeled data' from CIFAR-10 and Tiny Images, respectively. However, it does not specify explicit train/validation/test split percentages or sample counts for all three sets; the standard CIFAR-10 split is implied, but no validation split is stated. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or cloud computing instance specifications used for running the experiments. |
| Software Dependencies | No | The paper does not list specific software dependencies with their version numbers (e.g., Python 3.x, PyTorch 1.x). |
| Experiment Setup | Yes | The value of λ is selected as √(N/(2Kd)) except in Figure 8. ... In each iteration, the maximum number of SGD steps T is 10. Self-training terminates if ∥W^(ℓ+1) − W^(ℓ)∥_F / ∥W^(ℓ)∥_F ≤ 10⁻⁴ or when reaching 1000 iterations. ... λ and λ̃ are selected as N/(M+N) and M/(N+M), respectively, and the algorithm stops after 200 epochs. (Sketches of the training loop and the weighted objective follow the table.) |
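
Although the authors' own implementation is not released, the reported pseudocode (Algorithm 1) and experiment setup pin down the structure of the training loop. Below is a minimal, self-contained sketch of iterative self-training under that setup; the one-hidden-layer model, squared loss, learning rates, and synthetic data are illustrative assumptions, and tying λ to the pseudo-labeled updates is likewise an assumption rather than the paper's exact objective.

```python
import numpy as np

# Sketch of iterative self-training: pseudo-label the unlabeled data with the
# current model, run T SGD steps on the mixed objective, and repeat until the
# weights stop moving. Model, loss, data, and learning rates are assumptions.

rng = np.random.default_rng(0)
d, K = 10, 5                        # input dimension, hidden-layer width
N, M = 200, 1000                    # labeled / unlabeled sample counts

X_lab = rng.standard_normal((N, d))
y_lab = rng.choice([-1.0, 1.0], size=N)        # placeholder ±1 labels
X_unlab = rng.standard_normal((M, d))
a = rng.choice([-1.0, 1.0], size=K)            # fixed ±1 output-layer weights

def forward(W, X):
    """One-hidden-layer network f(x) = (1/K) Σ_k a_k · relu(w_kᵀ x)."""
    return np.maximum(X @ W, 0.0) @ a / K

def sgd_step(W, X, y, lr):
    """One full-batch gradient step on the squared loss mean_i (f(x_i) − y_i)² / 2."""
    H = np.maximum(X @ W, 0.0)
    err = H @ a / K - y
    G = (X.T @ ((H > 0) * err[:, None])) * a / (len(y) * K)
    return W - lr * G

W = 0.1 * rng.standard_normal((d, K))
lam = np.sqrt(N / (2 * K * d))      # λ = √(N/(2Kd)) as reported; using λ to scale
                                    # the pseudo-labeled updates is an assumption.

for iteration in range(1000):                        # at most 1000 self-training iterations
    out = forward(W, X_unlab)
    y_pseudo = np.where(out >= 0, 1.0, -1.0)         # pseudo-labels from the current model
    W_new = W.copy()
    for _ in range(10):                              # T = 10 SGD steps per iteration
        W_new = sgd_step(W_new, X_lab, y_lab, lr=0.05)
        W_new = sgd_step(W_new, X_unlab, y_pseudo, lr=0.05 * lam)
    # Stopping rule from the setup: ∥W^(ℓ+1) − W^(ℓ)∥_F / ∥W^(ℓ)∥_F ≤ 10⁻⁴
    rel_change = np.linalg.norm(W_new - W) / np.linalg.norm(W)
    W = W_new
    if rel_change <= 1e-4:
        break
```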
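For the deep-network runs, the reported weights λ = N/(M+N) and λ̃ = M/(N+M) amount to averaging the labeled and pseudo-labeled losses in proportion to their sample counts. The following PyTorch-style sketch shows that combined objective; the model, tensors, hard pseudo-labels, and the pairing of λ with the labeled term are assumptions made here, not details taken from the paper or the semisup-adv code.

```python
import torch.nn.functional as F

def weighted_self_training_loss(model, x_lab, y_lab, x_unlab, y_pseudo, N, M):
    """Combined objective with λ = N/(M+N) on the labeled loss and
    λ̃ = M/(N+M) on the pseudo-labeled loss (pairing assumed for illustration)."""
    lam = N / (M + N)           # labeled-loss weight
    lam_tilde = M / (N + M)     # pseudo-labeled-loss weight
    loss_lab = F.cross_entropy(model(x_lab), y_lab)
    loss_unlab = F.cross_entropy(model(x_unlab), y_pseudo)
    return lam * loss_lab + lam_tilde * loss_unlab
```

With N = 50K labeled and M = 500K pseudo-labeled images, this weighting is equivalent to one pooled average in which every example, labeled or pseudo-labeled, contributes equally.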