FlexMatch: Boosting Semi-Supervised Learning with Curriculum Pseudo Labeling

Authors: Bowen Zhang, Yidong Wang, Wenxin Hou, Hao Wu, Jindong Wang, Manabu Okumura, Takahiro Shinozaki

NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate FlexMatch and other CPL-enabled algorithms on common SSL datasets: CIFAR10/100 [23], SVHN [24], STL-10 [25] and ImageNet [26], and extensively investigate the performance under various labeled data amounts. We mainly compare our method with Pseudo-Labeling [4], UDA [11] and FixMatch [14]... The classification error rates on CIFAR-10/100, STL-10 and SVHN datasets are in Table 1...
Researcher Affiliation | Collaboration | Bowen Zhang (Tokyo Institute of Technology) bowen.z.ab@m.titech.ac.jp; Yidong Wang (Tokyo Institute of Technology) wang.y.ca@m.titech.ac.jp; Wenxin Hou (Microsoft) wenxinhou@microsoft.com; Hao Wu (Tokyo Institute of Technology) wu.h.aj@m.titech.ac.jp; Jindong Wang (Microsoft Research Asia) jindwang@microsoft.com; Manabu Okumura (Tokyo Institute of Technology) oku@pi.titech.ac.jp; Takahiro Shinozaki (Tokyo Institute of Technology) shinot@ict.e.titech.ac.jp
Pseudocode | Yes | Algorithm 1: FlexMatch algorithm.
Open Source Code | Yes | We open-source our code at https://github.com/TorchSSL/TorchSSL.
Open Datasets | Yes | We evaluate FlexMatch and other CPL-enabled algorithms on common SSL datasets: CIFAR10/100 [23], SVHN [24], STL-10 [25] and ImageNet [26]
Dataset Splits | Yes | We evaluate FlexMatch and other CPL-enabled algorithms on common SSL datasets: CIFAR10/100 [23], SVHN [24], STL-10 [25] and ImageNet [26] and extensively investigate the performance under various labeled data amounts... The classification error rates on CIFAR-10/100, STL-10 and SVHN datasets are in Table 1.
Hardware Specification | Yes | Figure 2 shows the average running time of a single iteration on a single GeForce RTX 3090 GPU.
Software Dependencies | No | The paper mentions 'PyTorch [28]' and cites its publication, but does not provide a specific version number (e.g., PyTorch 1.x.x) for the software dependencies used in the experiments.
Experiment Setup | Yes | Concretely, the optimizer for all experiments is standard stochastic gradient descent (SGD) with a momentum of 0.9 [29, 30]. For all datasets, we use an initial learning rate of 0.03 with a cosine learning rate decay schedule [31], η = η₀ · cos(7πk / (16K)), where η₀ is the initial learning rate, k is the current training step, and K is the total number of training steps, set to 2^20. We also perform an exponential moving average with a momentum of 0.999. The batch size of labeled data is 64 except for ImageNet. µ is set to 1 for Pseudo-Label and 7 for UDA, FixMatch, and FlexMatch. τ is set to 0.8 for UDA and 0.95 for Pseudo-Label, FixMatch, and FlexMatch.
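
As a reproduction aid, the optimizer, learning-rate schedule, and EMA described in the Experiment Setup row can be expressed as a short PyTorch sketch. This is a minimal illustration under the hyperparameters quoted above (η₀ = 0.03, momentum 0.9, K = 2^20, EMA momentum 0.999); it is not the authors' TorchSSL implementation, and the helper names `build_optimizer_and_scheduler` and `update_ema`, as well as the placeholder model and loss, are hypothetical.

```python
import math
import copy
import torch

# Hypothetical helper reflecting the quoted setup (not the TorchSSL code):
# SGD with momentum 0.9, eta_0 = 0.03, and the cosine decay
# eta = eta_0 * cos(7*pi*k / (16*K)), implemented via LambdaLR.
def build_optimizer_and_scheduler(model, lr0=0.03, momentum=0.9, total_steps=2**20):
    optimizer = torch.optim.SGD(model.parameters(), lr=lr0, momentum=momentum)
    cosine = lambda k: math.cos(7.0 * math.pi * k / (16.0 * total_steps))
    scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=cosine)
    return optimizer, scheduler

# Exponential moving average of parameters with momentum (decay) 0.999,
# updated once per training step.
@torch.no_grad()
def update_ema(model, ema_model, decay=0.999):
    for p, ema_p in zip(model.parameters(), ema_model.parameters()):
        ema_p.mul_(decay).add_(p, alpha=1.0 - decay)

# Usage sketch: one optimization step per iteration, K = 2^20 steps in total.
# model = ...                      # e.g. the backbone network used for SSL
# ema_model = copy.deepcopy(model) # evaluated copy tracked by EMA
# optimizer, scheduler = build_optimizer_and_scheduler(model)
# for step in range(2**20):
#     loss = ...                   # supervised + unlabeled (pseudo-label) loss
#     optimizer.zero_grad()
#     loss.backward()
#     optimizer.step()
#     scheduler.step()
#     update_ema(model, ema_model)
```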