Realistic Evaluation of Semi-supervised Learning Algorithms in Open Environments
Authors: Lin-Han Jia, Lan-Zhe Guo, Zhi Zhou, Yu-Feng Li
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | we re-implement widely adopted SSL algorithms within a unified SSL toolkit and evaluate their performance on the proposed open-environment SSL benchmarks, covering image, text, and tabular datasets. |
| Researcher Affiliation | Academia | National Key Laboratory for Novel Software Technology, School of Artificial Intelligence, School of Intelligence Science and Technology, Nanjing University, Nanjing 210023, China |
| Pseudocode | No | The paper describes algorithms and theoretical frameworks but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The re-implementation and benchmark datasets are all publicly available. More details can be found at https://ygzwqzd.github.io/Robust-SSL-Benchmark. |
| Open Datasets | Yes | We conduct experiments on various types of datasets, including 3 tabular datasets: iris, wine, and letter; 3 image datasets: Image-CLEF (Caputo et al., 2014), CIFAR-10, and CIFAR-100; and 3 text datasets: Amazon reviews (McAuley & Leskovec, 2013), IMDB movie reviews (Maas et al., 2011), and AG News (Zhang et al., 2015). |
| Dataset Splits | Yes | To ensure reliability, we conducted three experiments for each sampling point with seed values of 0, 1, and 2. The average of these experiments was used to plot the curve, with linear interpolation between adjacent sampling points. More detailed settings of the experiments are presented in appendix A.5. From the source-domain data, 100 examples are taken as labeled data. Half of the remaining source-domain examples are used as test data, while the other half is combined with the target-domain data to form an unlabeled dataset. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, memory, or cloud instance types used for running experiments. |
| Software Dependencies | No | The paper mentions models such as XGBoost, FT-Transformer, ResNet50, and RoBERTa, and packages such as `torchvision.models`, `transformers`, and `scikit-learn`, but does not provide specific version numbers for these software components or libraries. |
| Experiment Setup | Yes | The "A.5.2 HYPER-PARAMETERS OF COMPARED ALGORITHMS" section details parameters such as "ratio of unsupervised loss λu is set to 1.0", "threshold is set to 0.95", "EMA decay is set to 0.999", "batch size", "iteration", "optimizer: SGD with a learning rate of 5e-4 and a momentum of 0.9", and "scheduler: Cosine Warmup". |
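The dataset-split protocol quoted above (100 labeled source examples; half of the remaining source data as the test set; the rest merged with target-domain data as unlabeled data) can be sketched as follows. The function name and array-based interface are illustrative, not the paper's actual code:

```python
import numpy as np

def open_environment_split(source_X, source_y, target_X, n_labeled=100, seed=0):
    """Illustrative sketch of the split described in the paper:
    n_labeled source examples -> labeled set; half of the remaining
    source data -> test set; the other half plus all target-domain
    data -> unlabeled set."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(source_X))

    labeled_idx = idx[:n_labeled]
    rest = idx[n_labeled:]
    half = len(rest) // 2
    test_idx, unlabeled_src_idx = rest[:half], rest[half:]

    labeled = (source_X[labeled_idx], source_y[labeled_idx])
    test = (source_X[test_idx], source_y[test_idx])
    # Unlabeled pool mixes the leftover source data with the target domain.
    unlabeled = np.concatenate([source_X[unlabeled_src_idx], target_X])
    return labeled, test, unlabeled
```

Running the experiment three times with `seed=0`, `seed=1`, and `seed=2` and averaging, as the paper describes, would then vary only the permutation above.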
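Two of the reported hyperparameters, the cosine-warmup scheduler and the 0.999 EMA decay, can be sketched in a few lines. The warmup length is an assumed parameter and the paper's exact scheduler implementation may differ:

```python
import math

def cosine_warmup_lr(step, total_steps, base_lr=5e-4, warmup_steps=500):
    """Linear warmup to base_lr, then cosine decay to zero.
    warmup_steps=500 is illustrative; the paper does not specify it."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * base_lr * (1 + math.cos(math.pi * progress))

def ema_update(ema_params, params, decay=0.999):
    """Exponential moving average of model parameters with the
    reported decay of 0.999."""
    return [decay * e + (1 - decay) * p for e, p in zip(ema_params, params)]
```

With this schedule the learning rate peaks at `base_lr` exactly when warmup ends and reaches zero at `total_steps`, which matches the commonly used cosine-warmup shape.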