Leveraging Local Variance for Pseudo-Label Selection in Semi-supervised Learning

Authors: Zeping Min, Jinfeng Bai, Chengfei Li

AAAI 2024

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Our methodology is validated through a series of experiments on widely-used image classification datasets, such as CIFAR-10, CIFAR-100, and SVHN, spanning various labeled data quantity scenarios. The empirical findings show that the LVM method substantially outpaces current SSL techniques, achieving state-of-the-art results in many of these scenarios. |
| Researcher Affiliation | Collaboration | 1. Peking University, Beijing, China; 2. TAL Education Group, Beijing, China. zpm@pku.edu.cn, baijinfeng1@tal.com, lichengfei@tal.com |
| Pseudocode | Yes | Algorithm 1: One iteration of Local Variance Match (LVM) Algorithm. (A hedged sketch of the selection step appears below the table.) |
| Open Source Code | No | The paper does not contain an explicit statement or a link indicating that the source code for the LVM method is publicly available. |
| Open Datasets | Yes | We assess the performance of our Local Variance Match (LVM) approach in image classification across several benchmark datasets: CIFAR-10, CIFAR-100, SVHN, and ImageNet (Russakovsky et al. 2015)... For the speech recognition experiment, we use the AISHELL-1 dataset (Bu et al. 2017) (approximately 150 hours) as the labeled data and the AISHELL-2 dataset (Du et al. 2018) (approximately 1000 hours) as the unlabeled data. |
| Dataset Splits | No | The paper specifies the number of labeled data points used for training on various datasets (e.g., "40, 250, and 4000 labeled data points for CIFAR-10"), but it does not provide explicit percentages, counts, or references to predefined train/validation/test splits for the full datasets, including the unlabeled portions. |
| Hardware Specification | Yes | All experiments were carried out on four Tesla V100 32GB GPUs. |
| Software Dependencies | No | The paper mentions various models, optimizers, and toolkits (e.g., "WideResNet-28-2", "SGD optimizer", "WeNet"), but it does not provide specific version numbers for ancillary software dependencies such as Python, PyTorch, TensorFlow, or CUDA. |
| Experiment Setup | Yes | Under the most stringent conditions (10 labeled data points for CIFAR-10 and 400 labeled data points for CIFAR-100), we utilize the SGD optimizer with the following parameters: a learning rate of 0.03, a cosine learning rate decay schedule, momentum of 0.9, and a weight decay of 0.0005. We set τ1 to a relative value of 0.97, meaning we exclude the top 3% of pseudo-labels with high local variance. (See the training-configuration sketch below the table.) |