Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
On Pseudo-Labeling for Class-Mismatch Semi-Supervised Learning
Authors: Lu Han, Han-Jia Ye, De-Chuan Zhan
TMLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments show that our method achieves steady improvement over supervised baseline and state-of-the-art performance under all class mismatch ratios on different benchmarks. ... Experiments on different SSL benchmarks empirically validate the effectiveness of our model. |
| Researcher Affiliation | Academia | Lu Han EMAIL State Key Laboratory for Novel Software Technology, Nanjing University Han-Jia Ye EMAIL State Key Laboratory for Novel Software Technology, Nanjing University De-Chuan Zhan EMAIL State Key Laboratory for Novel Software Technology, Nanjing University |
| Pseudocode | Yes | Algorithm 1 Υ-Model algorithm |
| Open Source Code | No | The paper does not provide concrete access to source code. It only mentions 'Reviewed on OpenReview: https://openreview.net/forum?id=tLG26QxoD8', which is a review platform and does not host code. |
| Open Datasets | Yes | CIFAR10 (6/4): created from CIFAR10 (Krizhevsky & Hinton, 2009). ... CIFAR100 (50/50): created from CIFAR100 (Krizhevsky & Hinton, 2009). ... Tiny ImageNet (100/100): created from Tiny ImageNet, which is a subset of ImageNet (Deng et al., 2009) ... ImageNet100 (50/50): created from the 100 class subset of ImageNet (Deng et al., 2009). |
| Dataset Splits | Yes | CIFAR10 (6/4): ... We select 400 labeled samples for each ID class and totally 20,000 unlabeled samples from ID and OOD classes. SVHN (6/4): ... We select 100 labeled samples for each ID class and totally 20,000 unlabeled samples. CIFAR100 (50/50): ... We select 100 labeled samples for each ID class and a total of 20,000 unlabeled samples. Tiny ImageNet (100/100): ... We select 100 labeled samples for each ID class and 40,000 unlabeled samples. ImageNet100 (50/50): ... We select 100 labeled samples for each ID class and a total of 20,000 unlabeled samples. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, processor types, or memory used for running its experiments. It only mentions using "Wide-ResNet-28-2" as the backbone. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers. It mentions "Adam as the optimization algorithm" but does not specify the version of Adam or the framework/library used (e.g., PyTorch, TensorFlow, Scikit-learn). |
| Experiment Setup | Yes | For each epoch, we iterate over the unlabeled set and random sample labeled data, each unlabeled and labeled mini-batch contains 128 samples. We adopt Adam as the optimization algorithm with the initial learning rate 3 × 10⁻³ and train for 400 epochs. ... We first train a classification model only on labeled data for 100 epochs without RPL and SEC. We update pseudo-labels every 2 epochs. For both datasets, we set τ = 0.95, γ = 0.3, K = 4. |
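
The quoted setup and split values can be collected into a small configuration object for anyone attempting a reproduction. The following is a minimal sketch, not the authors' code: the class and function names (`ExperimentSetup`, `Split`, `unlabeled_batches_per_epoch`) are hypothetical, the reading of a label such as "(6/4)" as "6 ID classes / 4 OOD classes" is an assumption, and so is keeping the final partial mini-batch. The learning rate is taken as 3e-3.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentSetup:
    """Hyperparameters quoted in the Experiment Setup row."""
    batch_size: int = 128            # per labeled and per unlabeled mini-batch
    optimizer: str = "Adam"
    initial_lr: float = 3e-3
    total_epochs: int = 400
    warmup_epochs: int = 100         # labeled-only training, without RPL and SEC
    pseudo_label_interval: int = 2   # pseudo-labels updated every 2 epochs
    tau: float = 0.95
    gamma: float = 0.3
    K: int = 4


@dataclass(frozen=True)
class Split:
    """One benchmark split from the Dataset Splits row."""
    name: str
    id_classes: int          # assumption: the first number in "(6/4)" etc.
    labeled_per_class: int
    unlabeled_total: int

    @property
    def labeled_total(self) -> int:
        return self.id_classes * self.labeled_per_class


SPLITS = [
    Split("CIFAR10 (6/4)", 6, 400, 20_000),
    Split("SVHN (6/4)", 6, 100, 20_000),
    Split("CIFAR100 (50/50)", 50, 100, 20_000),
    Split("Tiny ImageNet (100/100)", 100, 100, 40_000),
    Split("ImageNet100 (50/50)", 50, 100, 20_000),
]


def unlabeled_batches_per_epoch(split: Split, setup: ExperimentSetup) -> int:
    # Assumption: the last partial mini-batch is kept rather than dropped.
    return math.ceil(split.unlabeled_total / setup.batch_size)


if __name__ == "__main__":
    setup = ExperimentSetup()
    for split in SPLITS:
        print(split.name, split.labeled_total,
              unlabeled_batches_per_epoch(split, setup))
```

Under these assumptions, CIFAR10 (6/4) has 2,400 labeled samples and 157 unlabeled mini-batches per epoch. Whether the 100 labeled-only epochs count toward the 400 total is not stated in the quoted text, so the sketch records both numbers without combining them.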