How Does Semi-supervised Learning with Pseudo-labelers Work? A Case Study

Authors: Yiwen Kou, Zixiang Chen, Yuan Cao, Quanquan Gu

ICLR 2023

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "In this section, we perform numerical experiments on synthetic datasets, generated according to Definition 3.1, to verify our main theoretical results. The code and data for our experiments can be found on Github." |
| Researcher Affiliation | Academia | Yiwen Kou (1), Zixiang Chen (1), Yuan Cao (2,3), Quanquan Gu (1); (1) Department of Computer Science, University of California, Los Angeles; (2) Department of Statistics and Actuarial Science, The University of Hong Kong; (3) Department of Mathematics, The University of Hong Kong |
| Pseudocode | No | The paper describes the algorithms and training procedures mathematically and textually but does not include explicit pseudocode or algorithm blocks. |
| Open Source Code | Yes | "The code and data for our experiments can be found on Github" (https://github.com/uclaml/SSL_Pseudo_Labeler) |
| Open Datasets | No | The paper states that experiments are performed on "synthetic datasets, generated according to Definition 3.1". It does not provide a link or citation to a pre-existing publicly available dataset. |
| Dataset Splits | No | The paper specifies a labeled training sample size n_l = 20 and a pseudo-labeled training sample size n_u = 20000 but does not explicitly describe training/validation/test splits or cross-validation settings. |
| Hardware Specification | No | The paper does not specify the hardware (e.g., GPU or CPU models) used to run the experiments. |
| Software Dependencies | No | The paper mentions the activation function σ(z) = [z]_+^3 but does not specify any software libraries or their version numbers. |
| Experiment Setup | Yes | "In particular, we set the problem dimension d = 10000, labeled training sample size n_l = 20 (10 positive samples and 10 negative samples), pseudo-labeled training sample size n_u = 20000 (10000 positive samples and 10000 negative samples), feature vector v sampled from distribution N(0, I) and noise vector sampled from distribution N(0, σ_p^2 I), where σ_p = 10d^{0.01}. ... network width m = 20, activation function σ(z) = [z]_+^3, regularization parameter λ = 0.1 and learning rate η = 1 × 10^{-4}. Besides, we initialize CNN parameters from N(0, σ_0^2), where σ_0 = 0.1 d^{-3/4}. After 200 iterations, ... By applying learning rate η = 0.1 and after T = 100 iterations." |
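The reported setup is concrete enough to sketch in code. Below is a minimal NumPy sketch of the synthetic-data generation and the cubed-ReLU activation using the stated parameters (d = 10000, n_l = 20, σ_p = 10d^{0.01}, σ(z) = [z]_+^3). The single-patch form x = y·v + ξ is an assumption made here for illustration; the paper's Definition 3.1 may use a richer (e.g., multi-patch signal/noise) structure, so treat this as a sketch rather than the authors' exact data model.

```python
import numpy as np

# Parameters quoted in the Experiment Setup row.
d = 10_000                  # problem dimension
n_l = 20                    # labeled samples (10 positive, 10 negative)
sigma_p = 10 * d ** 0.01    # noise standard deviation sigma_p = 10 * d^0.01

rng = np.random.default_rng(0)

# Feature vector v ~ N(0, I), as stated in the setup.
v = rng.standard_normal(d)

def sigma(z):
    """Activation sigma(z) = [z]_+^3 (ReLU raised to the third power)."""
    return np.maximum(z, 0.0) ** 3

def make_dataset(n, rng):
    """Balanced labels y in {+1, -1}; x = y * v + xi with xi ~ N(0, sigma_p^2 I).
    NOTE: the single-patch form x = y * v + xi is an assumption; the paper's
    Definition 3.1 may instead place signal and noise in separate patches."""
    y = np.repeat([1, -1], n // 2)
    xi = sigma_p * rng.standard_normal((n, d))
    X = y[:, None] * v[None, :] + xi
    return X, y

X_l, y_l = make_dataset(n_l, rng)
```

The pseudo-labeled set would be drawn the same way with n_u = 20000 unlabeled inputs, whose labels are then produced by the pseudo-labeler rather than taken from y.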