Positional Label for Self-Supervised Vision Transformer
Authors: Zhemin Zhang, Xun Gong
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments demonstrate that with the proposed self-supervised methods, ViT-B and Swin-B gain improvements of 1.20% (top-1 Acc) and 0.74% (top-1 Acc) on ImageNet, respectively, and 6.15% and 1.14% improvement on Mini-ImageNet. |
| Researcher Affiliation | Academia | Zhemin Zhang¹, Xun Gong¹,²,³*. ¹School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu, Sichuan, China; ²Engineering Research Center of Sustainable Urban Intelligent Transportation, Ministry of Education, China; ³Manufacturing Industry Chains Collaboration and Information Support Technology Key Laboratory of Sichuan Province, Chengdu, Sichuan, China. zheminzhang@my.swjtu.edu.cn, xgong@swjtu.edu.cn |
| Pseudocode | No | The paper describes methods using mathematical equations and figures, but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code is publicly available at: https://github.com/zhangzhemin/Positional Label. |
| Open Datasets | Yes | For image classification, we benchmark the proposed positional label on the ImageNet-1K, which contains 1.28M training images and 50K validation images from 1,000 classes. To explore the performance of positional label on small datasets, we also conducted experiments on Caltech-256 (Griffin, Holub, and Perona 2007) and Mini-ImageNet (Krizhevsky, Sutskever, and Hinton 2012). |
| Dataset Splits | Yes | For image classification, we benchmark the proposed positional label on the ImageNet-1K, which contains 1.28M training images and 50K validation images from 1,000 classes. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments, such as CPU/GPU models or memory. |
| Software Dependencies | No | We use the PyTorch toolbox (Paszke et al. 2019) to implement all our experiments. While PyTorch is mentioned, a specific version number is not provided, nor are other software dependencies with versions. |
| Experiment Setup | Yes | We employ an AdamW (Kingma and Ba 2014) optimizer for 300 epochs using a cosine decay learning rate scheduler and 20 epochs of linear warm-up. A batch size of 256, an initial learning rate of 0.001, and a weight decay of 0.05 are used. ViT-B/16 uses an image size of 384 and the others use 224. (A minimal sketch of this schedule follows the table.) |
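
To make the quoted recipe concrete, here is a minimal PyTorch sketch of the reported optimization setup: AdamW with an initial learning rate of 0.001 and weight decay of 0.05, 20 epochs of linear warm-up, then cosine decay over the remaining 280 of 300 epochs. The model and training loop are placeholders (the paper trains ViT-B/Swin-B on ImageNet-1K at batch size 256); this is a sketch under those assumptions, not the authors' released code.

```python
import math
import torch
from torch.optim import AdamW
from torch.optim.lr_scheduler import LambdaLR

# Hyperparameters as quoted in the paper's experiment setup.
EPOCHS = 300
WARMUP_EPOCHS = 20
BASE_LR = 1e-3
WEIGHT_DECAY = 0.05
BATCH_SIZE = 256  # quoted batch size; the actual data pipeline is omitted here

# Placeholder model: the paper uses ViT-B or Swin-B, but any nn.Module fits.
model = torch.nn.Linear(768, 1000)

optimizer = AdamW(model.parameters(), lr=BASE_LR, weight_decay=WEIGHT_DECAY)

def warmup_cosine(epoch: int) -> float:
    """LR multiplier: linear warm-up for 20 epochs, then cosine decay to 0."""
    if epoch < WARMUP_EPOCHS:
        return (epoch + 1) / WARMUP_EPOCHS
    progress = (epoch - WARMUP_EPOCHS) / (EPOCHS - WARMUP_EPOCHS)
    return 0.5 * (1.0 + math.cos(math.pi * progress))

scheduler = LambdaLR(optimizer, lr_lambda=warmup_cosine)

for epoch in range(EPOCHS):
    # ... one training pass over ImageNet-1K would go here ...
    optimizer.step()   # placeholder so scheduler.step() follows an optimizer step
    scheduler.step()   # advance the warm-up/cosine schedule once per epoch
```

Whether the schedule is stepped per epoch or per iteration is not stated in the quoted text; a per-iteration variant would use the same multiplier with epoch counts replaced by step counts.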