Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
The Rich Get Richer: Disparate Impact of Semi-Supervised Learning
Authors: Zhaowei Zhu, Tianyi Luo, Yang Liu
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We theoretically and empirically establish the above observation for a broad family of SSL algorithms, which either explicitly or implicitly use an auxiliary pseudo-label. Experiments on a set of image and text classification tasks confirm our claims. |
| Researcher Affiliation | Academia | Zhaowei Zhu, Tianyi Luo, and Yang Liu. Computer Science and Engineering, University of California, Santa Cruz. |
| Pseudocode | No | The paper does not include any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | Yes | Code is available at github.com/UCSC-REAL/Disparate-SSL. |
| Open Datasets | Yes | For image classification, we experiment on CIFAR-10 and CIFAR-100 datasets (Krizhevsky et al., 2009). ... For text classification, we employ three datasets: Yahoo! Answers (Chang et al., 2008), AG News (Zhang et al., 2015) and Jigsaw Toxicity (Kaggle, 2018). |
| Dataset Splits | Yes | The train/valid/test splitting in the image datasets (CIFAR-10 and CIFAR-100) is 45000:5000:10000. As for the splitting in the text datasets (Yahoo! Answers and AG News), we follow the setting in (Chen et al., 2020). ... In addition, the ratio of train:valid:test on either race or gender sub-population case is 8:1:1. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments (e.g., GPU models, CPU types, or memory specifications). |
| Software Dependencies | No | The paper mentions software like MixMatch, UDA, and MixText, but does not provide specific version numbers for any software dependencies. |
| Experiment Setup | No | The paper describes general settings and dataset sizes, but does not provide specific experimental setup details such as hyperparameter values (e.g., learning rate, batch size, number of epochs, optimizer settings) in the main text or appendix. |