Self-Tuning for Data-Efficient Deep Learning

Authors: Ximei Wang, Jinghan Gao, Mingsheng Long, Jianmin Wang

ICML 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Section 5 (Experiments): We empirically evaluate Self-Tuning in several dimensions: (1) Task Variety: four visual tasks with various dataset scales, including CUB-200-2011 (Wah et al., 2011), Stanford Cars (Krause et al., 2013), FGVC Aircraft (Maji et al., 2013), and CIFAR-100 (Krizhevsky & Hinton, 2009), as well as one NLP task: CoNLL 2003 (Sang & Meulder, 2003).
Researcher Affiliation | Academia | School of Software, BNRist, Tsinghua University, Beijing, China, 100084.
Pseudocode | No | The paper describes the proposed method using textual explanations and mathematical equations, but it does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | Code will be available at github.com/thuml/Self-Tuning.
Open Datasets | Yes | CUB-200-2011 (Wah et al., 2011), Stanford Cars (Krause et al., 2013), FGVC Aircraft (Maji et al., 2013), and CIFAR-100 (Krizhevsky & Hinton, 2009), as well as one NLP task: CoNLL 2003 (Sang & Meulder, 2003).
Dataset Splits | Yes | Label Proportion: the proportion of labeled data ranges from 15% to 50%, following the common practice of transfer learning, and also includes 4 labels and 25 labels per class, following the popular protocol of semi-supervised learning.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments. It only mentions the network architectures and pre-trained models utilized.
Software Dependencies | No | The paper mentions optimizers (SGD with momentum 0.9) and data augmentation methods (RandAugment) but does not provide specific version numbers for any software dependencies or libraries (e.g., Python, PyTorch, TensorFlow, CUDA).
Experiment Setup | Yes | Following MoCo (He et al., 2020), we adopted a default temperature τ = 0.07, a learning rate lr = 0.001, and a queue size D = 32 for each category. SGD with a momentum of 0.9 is adopted as the optimizer. Each experiment is repeated three times with different random seeds.
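
To make the quoted hyperparameters concrete, the snippet below is a minimal PyTorch sketch of the configuration described in the Experiment Setup row: temperature τ = 0.07, learning rate 0.001, a 32-entry queue per category, SGD with momentum 0.9, and three repeated runs. The class count, seed values, and helper names (set_seed, make_optimizer, contrastive_logits) are illustrative assumptions, not the authors' implementation; the released code at github.com/thuml/Self-Tuning is the authoritative reference.

```python
import random
from collections import deque

import torch
import torch.nn.functional as F

# Hyperparameters quoted in the Experiment Setup row.
TEMPERATURE = 0.07          # tau, following MoCo (He et al., 2020)
LEARNING_RATE = 1e-3        # lr = 0.001
QUEUE_SIZE = 32             # D: features kept per category
MOMENTUM = 0.9              # SGD momentum
NUM_CLASSES = 200           # placeholder (e.g. CUB-200-2011); set per dataset
SEEDS = [2021, 2022, 2023]  # three repeated runs; actual seed values are not reported


def set_seed(seed: int) -> None:
    """Fix the Python and PyTorch RNGs for one of the three repeated runs."""
    random.seed(seed)
    torch.manual_seed(seed)


def make_optimizer(model: torch.nn.Module) -> torch.optim.Optimizer:
    """SGD with momentum 0.9 and learning rate 0.001, as stated in the setup."""
    return torch.optim.SGD(model.parameters(), lr=LEARNING_RATE, momentum=MOMENTUM)


def contrastive_logits(query: torch.Tensor, keys: torch.Tensor) -> torch.Tensor:
    """Temperature-scaled cosine-similarity logits between a query and queued keys."""
    query = F.normalize(query, dim=-1)
    keys = F.normalize(keys, dim=-1)
    return query @ keys.t() / TEMPERATURE


# One fixed-length feature queue per category (D = 32 entries each), a plain
# Python stand-in for the per-class queues described in the setup.
queues = {c: deque(maxlen=QUEUE_SIZE) for c in range(NUM_CLASSES)}
```

The pseudo group contrast loss itself is not reproduced here; the sketch only pins down the quoted optimizer, temperature, and queue-size settings.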
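
The Dataset Splits row quotes two labeling protocols: a label proportion from 15% to 50% (transfer-learning practice) and 4 or 25 labels per class (semi-supervised practice). The sketch below illustrates how such labeled/unlabeled index splits could be drawn per class; the function names and sampling details are assumptions for illustration, not the authors' split code.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def _group_by_class(labels: Sequence[int]) -> Dict[int, List[int]]:
    """Map each class label to the list of example indices carrying it."""
    per_class: Dict[int, List[int]] = defaultdict(list)
    for idx, y in enumerate(labels):
        per_class[y].append(idx)
    return per_class


def split_by_proportion(labels: Sequence[int], proportion: float,
                        seed: int = 0) -> Tuple[List[int], List[int]]:
    """Label `proportion` of each class (transfer-learning protocol, 15%-50%)."""
    rng = random.Random(seed)
    labeled, unlabeled = [], []
    for indices in _group_by_class(labels).values():
        rng.shuffle(indices)
        k = max(1, round(proportion * len(indices)))
        labeled.extend(indices[:k])
        unlabeled.extend(indices[k:])
    return labeled, unlabeled


def split_by_count(labels: Sequence[int], labels_per_class: int,
                   seed: int = 0) -> Tuple[List[int], List[int]]:
    """Keep a fixed number of labels per class (semi-supervised protocol, 4 or 25)."""
    rng = random.Random(seed)
    labeled, unlabeled = [], []
    for indices in _group_by_class(labels).values():
        rng.shuffle(indices)
        labeled.extend(indices[:labels_per_class])
        unlabeled.extend(indices[labels_per_class:])
    return labeled, unlabeled
```

For example, split_by_proportion(labels, 0.15) and split_by_count(labels, 4) would mirror the 15% label-proportion and 4-labels-per-class settings, respectively.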