How Does Semi-supervised Learning with Pseudo-labelers Work? A Case Study
Authors: Yiwen Kou, Zixiang Chen, Yuan Cao, Quanquan Gu
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we perform numerical experiments on synthetic datasets, generated according to Definition 3.1, to verify our main theoretical results. The code and data for our experiments can be found on Github. |
| Researcher Affiliation | Academia | Yiwen Kou¹, Zixiang Chen¹, Yuan Cao²,³, Quanquan Gu¹. ¹Department of Computer Science, University of California, Los Angeles; ²Department of Statistics and Actuarial Science, The University of Hong Kong; ³Department of Mathematics, The University of Hong Kong |
| Pseudocode | No | The paper describes the algorithms and training procedures mathematically and textually but does not include explicit pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code and data for our experiments can be found on GitHub: https://github.com/uclaml/SSL_Pseudo_Labeler |
| Open Datasets | No | The paper states that experiments are performed on "synthetic datasets, generated according to Definition 3.1". It does not provide a link or citation to a pre-existing publicly available dataset. |
| Dataset Splits | No | The paper specifies 'labeled training sample size n_l = 20' and 'pseudo-labeled training sample size n_u = 20000' but does not explicitly describe training/validation/test splits or cross-validation settings. |
| Hardware Specification | No | The paper does not specify the hardware (e.g., GPU, CPU models) used for running the experiments. |
| Software Dependencies | No | The paper mentions the 'activation function σ(z) = [z]_+^3' but does not specify any software libraries or their version numbers. |
| Experiment Setup | Yes | In particular, we set the problem dimension d = 10000, labeled training sample size n_l = 20 (10 positive samples and 10 negative samples), pseudo-labeled training sample size n_u = 20000 (10000 positive samples and 10000 negative samples), feature vector v sampled from distribution N(0, I) and noise vector sampled from distribution N(0, σ_p^2 I) where σ_p = 10d^{0.01}. ... network width m = 20, activation function σ(z) = [z]_+^3, regularization parameter λ = 0.1 and learning rate η = 1 × 10^{-4}. Besides, we initialize CNN parameters from N(0, σ_0^2), where σ_0 = 0.1d^{-3/4}. After 200 iterations, ... By applying learning rate η = 0.1 and after T = 100 iterations. |
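
For concreteness, here is a minimal sketch of how synthetic data matching the quoted configuration could be generated. Definition 3.1 is not reproduced in the excerpt, so the two-patch layout (one signal patch y·v and one noise patch ξ) is our assumption; the dimensions and distributions follow the Experiment Setup row above.

```python
import numpy as np

def generate_dataset(n, d, sigma_p, v, rng):
    """Sample n points from a signal-noise model (sketch of Definition 3.1).

    Each point is assumed to consist of a signal patch y * v and a noise
    patch xi ~ N(0, sigma_p^2 I); the two-patch layout is our assumption,
    since the excerpt does not reproduce Definition 3.1.
    """
    # Balanced labels: half +1, half -1, as in the quoted setup.
    y = rng.permutation(np.repeat([1, -1], n // 2))
    signal = y[:, None] * v[None, :]               # (n, d) signal patches
    noise = sigma_p * rng.standard_normal((n, d))  # (n, d) noise patches
    X = np.stack([signal, noise], axis=1)          # (n, 2, d): two patches
    return X, y

rng = np.random.default_rng(0)
d = 10_000
sigma_p = 10 * d ** 0.01            # sigma_p = 10 d^{0.01} from the setup row
v = rng.standard_normal(d)          # feature vector v ~ N(0, I)

X_l, y_l = generate_dataset(20, d, sigma_p, v, rng)      # labeled: n_l = 20
X_u, y_u = generate_dataset(20_000, d, sigma_p, v, rng)  # pseudo-labeled pool: n_u = 20000
```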
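
A companion sketch of the training loop, using the hyperparameters quoted above (network width m = 20, activation σ(z) = [z]_+^3, initialization scale σ_0 = 0.1d^{-3/4}, regularization λ = 0.1, learning rate η = 1 × 10^{-4}, 200 iterations). The two-channel CNN form and the logistic loss with L2 regularization are assumptions on our part; the paper's actual model and objective are given in its problem setup.

```python
import torch
import torch.nn.functional as nnf

d, m = 10_000, 20
sigma_0 = 0.1 * d ** (-3 / 4)      # initialization scale sigma_0 = 0.1 d^{-3/4}
lam, eta, T = 0.1, 1e-4, 200       # lambda = 0.1, eta = 1e-4, 200 iterations

# One bank of m filters per output channel (+1 / -1), initialized N(0, sigma_0^2).
W = torch.randn(2, m, d) * sigma_0
W.requires_grad_(True)

def cubic_relu(z):
    return torch.clamp(z, min=0) ** 3           # sigma(z) = [z]_+^3

def cnn_score(W, X):
    # X: (n, patches, d). Each filter is applied to every patch, activations
    # are summed per channel, and the score is F_{+1}(x) - F_{-1}(x).
    # This two-channel architecture is an assumption, not the paper's exact model.
    acts = cubic_relu(torch.einsum('cmd,npd->ncmp', W, X))
    channel_sums = acts.sum(dim=(2, 3))          # (n, 2)
    return channel_sums[:, 0] - channel_sums[:, 1]

# Labeled data from the generation sketch above.
X = torch.tensor(X_l, dtype=torch.float32)
y = torch.tensor(y_l, dtype=torch.float32)

for t in range(T):
    scores = cnn_score(W, X)
    # Logistic loss plus L2 penalty, one plausible reading of lambda = 0.1.
    loss = nnf.softplus(-y * scores).mean() + lam * (W ** 2).sum()
    loss.backward()
    with torch.no_grad():
        W -= eta * W.grad          # full-batch gradient descent step
        W.grad.zero_()
```

The excerpt also mentions a second stage with learning rate η = 0.1 and T = 100 iterations, presumably run on the n_u pseudo-labeled samples; that stage would reuse the same loop with those values substituted.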