Dash: Semi-Supervised Learning with Dynamic Thresholding

Authors: Yi Xu, Lei Shang, Jinxing Ye, Qi Qian, Yu-Feng Li, Baigui Sun, Hao Li, Rong Jin

ICML 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Finally, we empirically demonstrate the effectiveness of the proposed method in comparison with state-of-the-art over benchmarks.
Researcher Affiliation | Collaboration | (1) Machine Intelligence Technology, Alibaba Group. (2) National Key Laboratory for Novel Software Technology, Nanjing University.
Pseudocode | Yes | Algorithm 1 Dash: Semi-Supervised Learning with Dynamic Thresholding (a dynamic-thresholding sketch is given below the table).
Open Source Code | No | In our experiments, the FixMatch codebase is used: https://github.com/google-research/fixmatch. This link points to the codebase of a method used as a pipeline, not the specific implementation of Dash itself.
Open Datasets | Yes | We compare it with several state-of-the-art (SOTA) baselines on several standard SSL image classification benchmarks including CIFAR-10, CIFAR-100 (Krizhevsky & Hinton, 2009), SVHN (Netzer et al., 2011), and STL-10 (Coates et al., 2011) data sets.
Dataset Splits | No | The paper details training and testing splits for the datasets but does not explicitly mention a separate "validation" split. For example: "The original CIFAR data-sets have 50,000 training images and 10,000 testing images" and "The original SVHN data-set has 73,257 digits for training and 26,032 digits for testing".
Hardware Specification | No | The paper mentions the use of Wide ResNet models as backbone architectures but does not specify any hardware details like GPU models, CPU types, or memory used for experiments.
Software Dependencies | No | The paper mentions using "FixMatch as our pipeline," "CTAugment," "RandAugment," an "optimizer," and a "learning rate schedule," but no specific version numbers for any software components are provided.
Experiment Setup | Yes | The total number of training epochs is set to be 1024 and the mini-batch size is fixed as 64. For the value of weight decay, we use 5 × 10^-4 for CIFAR-10, SVHN and STL-10, 1 × 10^-3 for CIFAR-100. The SGD with momentum parameter of 0.9 is employed as the optimizer. The cosine learning rate decay schedule ... The initial learning rate is set to be 0.06 for all data-sets. ... we choose γ = 1.27 in ρ_t to reduce the dynamic threshold until its value to be 0.05. ... We fix the constant C as 1.0001 ... We decay the dynamic threshold every 9 epochs. We use the predicted label distribution as soft label during the training and it is sharpened by adjusting its temperature of 0.5 (an optimizer and schedule sketch is given below the table).
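
The Pseudocode row refers to Algorithm 1 (Dash). As a rough illustration of the dynamic-thresholding rule that algorithm is built around, here is a minimal PyTorch-style sketch; the helper names (dash_threshold, unsup_loss_with_dash) are hypothetical, and the loss form, sharpening, and masking follow the quoted setup only at a high level rather than reproducing the authors' implementation.

```python
import torch
import torch.nn.functional as F

def dash_threshold(rho_hat, t, C=1.0001, gamma=1.27, floor=0.05):
    """Dynamic threshold rho_t = C * gamma^(-(t-1)) * rho_hat, clipped at a floor.

    rho_hat: average loss on labeled data at the end of warm-up.
    t:       1-indexed threshold-update step; per the quoted setup the
             threshold is decayed every 9 epochs with gamma = 1.27 until
             it reaches 0.05.
    """
    return max(C * (gamma ** -(t - 1)) * rho_hat, floor)

def unsup_loss_with_dash(logits_weak, logits_strong, rho_t, temperature=0.5):
    """Unsupervised loss over unlabeled examples selected by the threshold.

    Soft pseudo-labels come from the weakly augmented view and are sharpened
    with temperature 0.5 (as in the quoted setup); the loss is evaluated on
    the strongly augmented view, and only examples whose loss falls below
    rho_t contribute (the Dash selection rule).
    """
    with torch.no_grad():
        soft_labels = torch.softmax(logits_weak / temperature, dim=-1)
    per_example = -(soft_labels * F.log_softmax(logits_strong, dim=-1)).sum(dim=-1)
    mask = (per_example.detach() <= rho_t).float()
    return (per_example * mask).sum() / mask.sum().clamp(min=1.0)
```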
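
For the Experiment Setup row, the sketch below shows one way the quoted optimizer and schedule could be wired up, assuming a PyTorch implementation. build_optimizer_and_schedule is a hypothetical helper; the exact cosine-decay formula is elided in the quote, so the form used in the FixMatch codebase, lr_k = lr * cos(7*pi*k / (16*K)), is assumed here.

```python
import math
import torch

def build_optimizer_and_schedule(model, total_steps,
                                 lr=0.06, momentum=0.9, weight_decay=5e-4):
    """SGD with momentum 0.9, initial learning rate 0.06, and weight decay
    5e-4 (the quoted values for CIFAR-10, SVHN and STL-10; CIFAR-100 uses
    1e-3), paired with a cosine decay over all training steps.
    """
    optimizer = torch.optim.SGD(model.parameters(), lr=lr,
                                momentum=momentum, weight_decay=weight_decay)
    # Assumed cosine form (FixMatch codebase): lr_k = lr * cos(7*pi*k / (16*K)).
    scheduler = torch.optim.lr_scheduler.LambdaLR(
        optimizer,
        lr_lambda=lambda step: math.cos(7.0 * math.pi * step / (16.0 * total_steps)))
    return optimizer, scheduler
```

Usage would be one scheduler.step() per optimizer update, with total_steps equal to 1024 epochs times the number of mini-batches of size 64.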