Train a One-Million-Way Instance Classifier for Unsupervised Visual Representation Learning

Authors: Yu Liu, Lianghua Huang, Pan Pan, Bin Wang, Yinghui Xu, Rong Jin

AAAI 2021, pp. 8706-8714 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate our method under ImageNet linear evaluation protocol and on several downstream tasks related to detection or fine-grained classification.
Researcher Affiliation | Industry | Yu Liu, Lianghua Huang, Pan Pan, Bin Wang, Yinghui Xu, Rong Jin. Machine Intelligence Technology Lab, Alibaba Group. {ly103369, xuangen.hlh, panpan.pp, ganfu.wb, renji.xyh, jinrong.jr}@alibaba-inc.com
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide concrete access to source code for the methodology described in this paper.
Open Datasets | Yes | Unless specified, we use ImageNet-1K to train our unsupervised model for most experiments. ImageNet-1K consists of around 1.28 million images belonging to 1000 classes.
Dataset Splits | Yes | Semi-supervised Learning performance on ImageNet-1K, where methods are required to classify images in the val set when only a small fraction (i.e., 1% or 10%) of manual labels are provided in the train set. (A per-class label-subsampling sketch follows the table.)
Hardware Specification | Yes | All experiments are conducted on 64 V100 GPUs with 32GB memory.
Software Dependencies | No | The paper mentions using ResNet-50 and the SGD optimizer but does not specify software dependencies with version numbers (e.g., PyTorch, TensorFlow, CUDA versions).
Experiment Setup | Yes | We use ResNet-50 (He et al. 2016) as the backbone in all our experiments. We train our model using the SGD optimizer, where the weight decay and momentum are set to 0.0001 and 0.9, respectively. The initial learning rate (lr) is set to 0.48 and decays using the cosine annealing scheduler. In addition, we use 10 epochs of linear lr warmup to stabilize training. The minibatch size is 4096 and the feature dimension D = 128. We set the temperature in Eq. (1) as τ = 0.15, and the smoothing factor in Eq. (3) as α = 0.2.
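
The quoted experiment setup is detailed enough to reconstruct the optimizer and learning-rate schedule. Below is a minimal sketch assuming a PyTorch/torchvision implementation; the framework, the total number of training epochs, and the exact composition of linear warmup with cosine decay are not stated in the quoted text and should be read as assumptions.

```python
# Minimal sketch of the reported optimization setup. PyTorch/torchvision,
# the total epoch count, and the warmup+cosine composition are assumptions;
# the paper row only states the hyperparameter values.
import math
import torch
import torch.nn as nn
from torchvision.models import resnet50

EPOCHS = 200          # assumption: total epochs are not given in the quoted text
WARMUP_EPOCHS = 10    # linear lr warmup, as stated
BASE_LR = 0.48
BATCH_SIZE = 4096     # global minibatch size (listed for completeness)
FEATURE_DIM = 128
TEMPERATURE = 0.15    # tau in Eq. (1), listed for completeness
SMOOTHING = 0.2       # alpha in Eq. (3), listed for completeness

# ResNet-50 backbone with a 128-d projection replacing the classifier head.
model = resnet50()
model.fc = nn.Linear(model.fc.in_features, FEATURE_DIM)

optimizer = torch.optim.SGD(
    model.parameters(),
    lr=BASE_LR,
    momentum=0.9,
    weight_decay=1e-4,
)

def lr_lambda(epoch: int) -> float:
    """Linear warmup for WARMUP_EPOCHS epochs, then cosine annealing to 0."""
    if epoch < WARMUP_EPOCHS:
        return (epoch + 1) / WARMUP_EPOCHS
    progress = (epoch - WARMUP_EPOCHS) / max(1, EPOCHS - WARMUP_EPOCHS)
    return 0.5 * (1.0 + math.cos(math.pi * progress))

# LambdaLR scales BASE_LR by the factor returned above, stepped once per epoch.
scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)
```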
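
The Dataset Splits row refers to the semi-supervised ImageNet protocol, in which only 1% or 10% of the train-set labels are kept. The sketch below shows one plausible way to draw such a subset with stratified per-class sampling; the sampling strategy, the `labels_by_class` structure, and the `sample_labeled_subset` helper are illustrative assumptions, since the quoted text does not say how the subsets are constructed.

```python
# Hypothetical sketch of building a 1% / 10% labeled subset per class.
# The data structure and helper name are illustrative, not from the paper.
import random

def sample_labeled_subset(labels_by_class: dict, fraction: float, seed: int = 0) -> dict:
    """Return a stratified subset keeping roughly `fraction` of the labels per class."""
    rng = random.Random(seed)
    subset = {}
    for cls, image_ids in labels_by_class.items():
        k = max(1, round(fraction * len(image_ids)))
        subset[cls] = rng.sample(image_ids, k)
    return subset

# Toy usage: 10 classes with 1280 images each, keeping about 1% of the labels.
toy = {c: [f"img_{c}_{i}" for i in range(1280)] for c in range(10)}
labeled_1pct = sample_labeled_subset(toy, fraction=0.01)
print(sum(len(v) for v in labeled_1pct.values()))  # roughly 1% of all images
```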