Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Leveraging Local Variance for Pseudo-Label Selection in Semi-supervised Learning
Authors: Zeping Min, Jinfeng Bai, Chengfei Li
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our methodology is validated through a series of experiments on widely-used image classification datasets, such as CIFAR-10, CIFAR-100, and SVHN, spanning various labeled data quantity scenarios. The empirical findings show that the LVM method substantially outpaces current SSL techniques, achieving state-of-the-art results in many of these scenarios. |
| Researcher Affiliation | Collaboration | ¹Peking University, Beijing, China; ²TAL Education Group, Beijing, China |
| Pseudocode | Yes | Algorithm 1: One iteration of Local Variance Match (LVM) Algorithm |
| Open Source Code | No | The paper does not contain an explicit statement or a link indicating that the source code for the LVM method is publicly available. |
| Open Datasets | Yes | We assess the performance of our Local Variance Match (LVM) approach in image classification across several benchmark datasets: CIFAR-10, CIFAR-100, SVHN, and ImageNet (Russakovsky et al. 2015)... For the speech recognition experiment, we use the AISHELL-1 dataset (Bu et al. 2017) (approximately 150 hours) as the labeled data and the AISHELL-2 dataset (Du et al. 2018) (approximately 1000 hours) as the unlabeled data. |
| Dataset Splits | No | The paper specifies the number of labeled data points used for training various datasets (e.g., "40, 250, and 4000 labeled data points for CIFAR-10"), but it does not provide explicit percentages, counts, or references to predefined train/validation/test splits for the full datasets, including the unlabeled portions. |
| Hardware Specification | Yes | All experiments were carried out on four Tesla V100 32GB GPUs. |
| Software Dependencies | No | The paper mentions various models, optimizers, and toolkits (e.g., "Wide ResNet-28-2", "SGD optimizer", "WeNet"), but it does not provide specific version numbers for ancillary software dependencies such as Python, PyTorch, TensorFlow, or CUDA. |
| Experiment Setup | Yes | Under the most stringent conditions (10 labeled data points for CIFAR-10 and 400 labeled data points for CIFAR-100), we utilize the SGD optimizer with the following parameters: a learning rate of 0.03, a cosine learning rate decay schedule, momentum of 0.9, and a weight decay of 0.0005. We set τ1 to a relative value of 0.97, meaning we exclude the top 3% of pseudo-labels with high local variance. |
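
The relative threshold in the Experiment Setup row can be read as a quantile cut: with τ1 = 0.97, the 3% of pseudo-labels with the highest local variance are discarded. The following is a minimal Python sketch of that selection step only, not the authors' implementation. It assumes the per-sample local-variance scores have already been computed (their definition is in the paper and is not reproduced here). The function name `select_low_variance`, the `keep_fraction` parameter, the key names in `SGD_SETTINGS`, and the example scores are all hypothetical; only the numeric values come from the quoted setup.

```python
# Optimizer values quoted in the Experiment Setup row; the key names are illustrative.
SGD_SETTINGS = {
    "learning_rate": 0.03,
    "lr_schedule": "cosine",
    "momentum": 0.9,
    "weight_decay": 0.0005,
}


def select_low_variance(local_variances, keep_fraction=0.97):
    """Return sorted indices of pseudo-labels kept after dropping the
    highest-variance tail (the top 1 - keep_fraction of the scores)."""
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")
    n = len(local_variances)
    # Rounding to the nearest integer is an assumption; the source does not
    # say how fractional counts are handled.
    n_keep = round(keep_fraction * n)
    # Rank indices from lowest to highest local variance (sorted() is stable for ties).
    ranked = sorted(range(n), key=lambda i: local_variances[i])
    return sorted(ranked[:n_keep])


# Hypothetical scores: 100 pseudo-labels whose local variance grows with the index.
scores = [i / 100 for i in range(100)]
kept = select_low_variance(scores)
```

With these made-up scores, the three highest-variance pseudo-labels (indices 97, 98, and 99) are excluded and the remaining 97 are kept for training.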