Improving Barely Supervised Learning by Discriminating Unlabeled Samples with Super-Class

Authors: Guan Gui, Zhen Zhao, Lei Qi, Luping Zhou, Lei Wang, Yinghuan Shi

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We compare our method with state-of-the-art SSL and BSL methods through extensive experiments on standard SSL benchmarks. Our method can achieve superior results, e.g., an average accuracy of 76.76% on CIFAR-10 with merely 1 label per class.
Researcher Affiliation | Academia | Guan Gui, Nanjing University, guiguan@smail.nju.edu.cn; Zhen Zhao, University of Sydney, zhen.zhao@sydney.edu.au; Lei Qi, Southeast University, qilei@seu.edu.cn; Luping Zhou, University of Sydney, luping.zhou@sydney.edu.au; Lei Wang, University of Wollongong, leiw@uow.edu.au; Yinghuan Shi, Nanjing University, syh@nju.edu.cn
Pseudocode | Yes | Algorithm 1, "Algorithm of our method"
Open Source Code | Yes | The code is available at https://github.com/GuanGui-nju/SCMatch.
Open Datasets | Yes | We validate the effectiveness of our proposed method by conducting experiments on widely used SSL benchmark datasets: CIFAR-10, CIFAR-100 [12], and STL-10 [13].
Dataset Splits | No | The paper uses standard SSL benchmark datasets (CIFAR-10, CIFAR-100, STL-10) but does not explicitly state the training/validation/test splits (e.g., percentages or counts); it only mentions sampling 1 or 2 labels per class for the labeled subset (see the sampling sketch after the table).
Hardware Specification | No | The paper does not explicitly describe the hardware used to run the experiments, such as GPU or CPU models.
Software Dependencies | No | The paper mentions WideResNet-28-2, WideResNet-28-8, and ResNet-18 as backbone networks, along with data augmentation techniques such as RandAugment and Cutout, but does not list software dependencies with version numbers (e.g., Python, PyTorch, or TensorFlow versions).
Experiment Setup | Yes | For the consistency learning module, we follow the same settings as [3], with τ1 = 0.95, |Bx| = 64, and |Bu| = 7|Bx|. For the discriminative learning module, we set T = 1 and τ2 = 0.8. In addition, since the three losses Lsup, Lcon, and Ldis all take the form of cross entropy, we prefer to set λcon = λdis = 1 to further reduce the number of hyperparameters. For CIFAR-10 and STL-10, we set K ∈ {nk/3, nk/2, nk} = {3, 5, 10}. For CIFAR-100, considering that each cluster should contain sufficient samples, we set K ∈ {nk/20, nk/10, nk/5} = {5, 10, 20}. The model is trained for a total of 2^20 iterations, and K is increased during the first 30% of iterations (see the configuration sketch below).
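
The Dataset Splits row notes that the paper only samples 1 or 2 labels per class rather than stating a conventional split. As a point of reference, here is a minimal sketch of drawing such a subset from CIFAR-10 with torchvision; the function name, the fixed seed, and the reliance on the dataset's targets attribute are illustrative assumptions, not details from the paper or the released code.

    # Hypothetical sketch of a "1 label per class" subset for barely
    # supervised learning; the paper does not specify its exact sampling
    # procedure, so seed and helper names here are assumptions.
    import random
    from collections import defaultdict

    from torchvision.datasets import CIFAR10

    def sample_labeled_subset(targets, labels_per_class=1, seed=0):
        """Return (labeled_indices, unlabeled_indices) with a per-class budget."""
        rng = random.Random(seed)
        by_class = defaultdict(list)
        for idx, y in enumerate(targets):
            by_class[y].append(idx)
        labeled = []
        for indices in by_class.values():
            labeled.extend(rng.sample(indices, labels_per_class))
        unlabeled = sorted(set(range(len(targets))) - set(labeled))
        return labeled, unlabeled

    train_set = CIFAR10(root="./data", train=True, download=True)
    labeled_idx, unlabeled_idx = sample_labeled_subset(train_set.targets)
    assert len(labeled_idx) == 10  # CIFAR-10: one label per each of 10 classes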
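
For the Experiment Setup row, the following configuration sketch collects the reported hyperparameters and the super-class number K ramp-up in one place. All identifiers are illustrative rather than taken from the released SCMatch code, and splitting the ramp phase evenly among the K stages is an assumption, since the paper only states that K is increased during the first 30% of iterations.

    # Minimal sketch of the reported training configuration (assumed names).
    CONFIG = {
        "tau1": 0.95,            # confidence threshold, consistency module
        "tau2": 0.8,             # threshold, discriminative module
        "T": 1,                  # temperature
        "batch_labeled": 64,     # |Bx|
        "mu": 7,                 # |Bu| = 7 * |Bx|
        "lambda_con": 1.0,       # all three losses are cross-entropy,
        "lambda_dis": 1.0,       # so both weights are fixed to 1
        "total_iters": 2 ** 20,
        "k_stages": (3, 5, 10),  # CIFAR-10 / STL-10: {nk/3, nk/2, nk}
        "ramp_frac": 0.3,        # K grows during the first 30% of training
    }

    def current_k(step: int, cfg: dict = CONFIG) -> int:
        """Super-class number K at a given training step.

        Dividing the ramp phase evenly among the stages is an assumption
        made here for concreteness.
        """
        ramp_iters = int(cfg["total_iters"] * cfg["ramp_frac"])
        if step >= ramp_iters:
            return cfg["k_stages"][-1]
        stage = min(len(cfg["k_stages"]) - 1,
                    step * len(cfg["k_stages"]) // ramp_iters)
        return cfg["k_stages"][stage]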