Leveraging Local Variance for Pseudo-Label Selection in Semi-supervised Learning
Authors: Zeping Min, Jinfeng Bai, Chengfei Li
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our methodology is validated through a series of experiments on widely-used image classification datasets, such as CIFAR-10, CIFAR-100, and SVHN, spanning various labeled data quantity scenarios. The empirical findings show that the LVM method substantially outpaces current SSL techniques, achieving state-of-the-art results in many of these scenarios. |
| Researcher Affiliation | Collaboration | ¹Peking University, Beijing, China; ²TAL Education Group, Beijing, China. zpm@pku.edu.cn, baijinfeng1@tal.com, lichengfei@tal.com |
| Pseudocode | Yes | Algorithm 1: One iteration of Local Variance Match (LVM) Algorithm (a hedged Python sketch of the selection step appears after this table) |
| Open Source Code | No | The paper does not contain an explicit statement or a link indicating that the source code for the LVM method is publicly available. |
| Open Datasets | Yes | We assess the performance of our Local Variance Match (LVM) approach in image classification across several benchmark datasets: CIFAR-10, CIFAR-100, SVHN, and ImageNet (Russakovsky et al. 2015)... For the speech recognition experiment, we use the AISHELL-1 dataset (Bu et al. 2017) (approximately 150 hours) as the labeled data and the AISHELL-2 dataset (Du et al. 2018) (approximately 1000 hours) as the unlabeled data. |
| Dataset Splits | No | The paper specifies the number of labeled data points used for training on various datasets (e.g., "40, 250, and 4000 labeled data points for CIFAR-10"), but it does not provide explicit percentages, counts, or references to predefined train/validation/test splits for the full datasets, including the unlabeled portions. |
| Hardware Specification | Yes | All experiments were carried out on four Tesla V100 32GB GPUs. |
| Software Dependencies | No | The paper mentions various models, optimizers, and toolkits (e.g., "WideResNet-28-2", "SGD optimizer", "WeNet"), but it does not provide specific version numbers for ancillary software dependencies such as Python, PyTorch, TensorFlow, or CUDA. |
| Experiment Setup | Yes | Under the most stringent conditions (10 labeled data points for CIFAR-10 and 400 labeled data points for CIFAR-100), we utilize the SGD optimizer with the following parameters: a learning rate of 0.03, a cosine learning rate decay schedule, momentum of 0.9, and a weight decay of 0.0005. We set τ1 to a relative value of 0.97, meaning we exclude the top 3% of pseudo-labels with high local variance. (A hedged configuration sketch follows the table.) |
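
Since the paper's Algorithm 1 is not reproduced in this report, the following is a minimal PyTorch sketch of the pseudo-label selection step it describes. It assumes local variance is estimated as the spread of softmax outputs over a few perturbed copies of each unlabeled input; the Gaussian perturbation scheme and the helper names `estimate_local_variance` and `select_pseudo_labels` are illustrative, not from the paper. Only the relative threshold (`keep_ratio=0.97`, i.e., drop the top 3% of high-variance pseudo-labels) is taken from the quoted setup.

```python
import torch
import torch.nn.functional as F

def estimate_local_variance(model, x_unlabeled, num_neighbors=4, noise_std=0.05):
    # Hypothetical neighborhood: a few small Gaussian perturbations of each
    # input. The paper's exact construction of the local neighborhood may differ.
    model.eval()
    with torch.no_grad():
        probs = torch.stack([
            F.softmax(model(x_unlabeled + noise_std * torch.randn_like(x_unlabeled)), dim=1)
            for _ in range(num_neighbors)
        ])                                        # (K, B, C)
        # One scalar per sample: variance of the predicted distribution
        # across the K neighbors, summed over classes.
        local_var = probs.var(dim=0).sum(dim=1)   # (B,)
        mean_probs = probs.mean(dim=0)            # (B, C)
    return local_var, mean_probs

def select_pseudo_labels(local_var, mean_probs, keep_ratio=0.97):
    # Relative threshold tau_1 = 0.97 from the quoted setup: exclude the top
    # 3% of pseudo-labels with the highest local variance, keep the rest.
    num_keep = int(keep_ratio * local_var.size(0))
    keep_idx = torch.argsort(local_var)[:num_keep]
    return keep_idx, mean_probs[keep_idx].argmax(dim=1)
```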
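The quoted optimizer configuration translates directly into PyTorch. This sketch uses a placeholder backbone (the paper trains a WideResNet-28-2) and an assumed `total_steps`, since the training length is not quoted in this report; only the SGD hyperparameters and the cosine decay schedule come from the paper.

```python
import torch
import torch.nn as nn

# Placeholder backbone; the paper uses WideResNet-28-2 on CIFAR-10.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))

# Settings quoted from the paper: SGD with learning rate 0.03,
# momentum 0.9, and weight decay 0.0005.
optimizer = torch.optim.SGD(model.parameters(), lr=0.03,
                            momentum=0.9, weight_decay=5e-4)

# Cosine learning-rate decay; total_steps is an assumption.
total_steps = 1 << 20
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=total_steps)
```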