Patch-level Contrastive Learning via Positional Query for Visual Pre-training
Authors: Shaofeng Zhang, Qiang Zhou, Zhibin Wang, Fan Wang, Junchi Yan
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct comprehensive experiments on standard visual benchmarks, including linear probing, finetuning on classification, detection and segmentation, where the proposed PQCL can stably improve the baselines DINO and iBOT. |
| Researcher Affiliation | Collaboration | Shaofeng Zhang 1 Qiang Zhou 2 Zhibin Wang 2 Fan Wang 2 Junchi Yan 1 1Department of CSE, and MoE Key Lab of Artificial Intelligence, Shanghai Jiao Tong University 2Alibaba Group. |
| Pseudocode | No | The paper describes the method using equations and figures, but no explicit pseudocode or algorithm blocks are provided. |
| Open Source Code | Yes | Code is available at https://github.com/Sherrylone/Query_Contrastive. |
| Open Datasets | Yes | We conduct self-supervised pre-training on the ImageNet-1K (Deng et al., 2009) training set with 1,000 classes... We also transfer the encoder pre-trained by PQCL to MS-COCO (Lin et al., 2014), ADE20K (Zhou et al., 2017), and the video segmentation dataset DAVIS 2017 (Pont-Tuset et al., 2017). |
| Dataset Splits | No | The paper utilizes well-known datasets like ImageNet-1K, MS-COCO, and ADE20K, which have standard splits. However, it does not explicitly state the specific percentages, sample counts, or methodology used for the training/validation/test splits in the context of its own experiments, nor does it explicitly cite which predefined validation splits were used. |
| Hardware Specification | Yes | The experiments are performed on a workstation with 16 V100 GPUs by default (if not otherwise specified). |
| Software Dependencies | No | The paper mentions the AdamW optimizer, but does not provide version numbers for software dependencies such as the programming language, deep learning framework (e.g., PyTorch, TensorFlow), or other libraries used for the experiments. |
| Experiment Setup | Yes | We train with AdamW... and a batch size of 1024... The learning rate is linearly ramped up during the first 30 epochs to its base value... lr = 0.0005 × batchsize / 256. After warmup, we decay the learning rate with a cosine schedule... The weight decay also follows a cosine schedule from 0.04 to 0.4. The temperature τ is set to 0.04, while we use a linear warm-up for τt from 0.04 to 0.07 during the first 30 epochs... For both baselines DINO and iBOT, the query crop ratio is randomly sampled from 0.05 to 0.25... For the baseline iBOT, we set the mask ratio of global views to 0.3. |
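
For concreteness, below is a minimal sketch of the warmup-plus-cosine schedules described in the Experiment Setup row, in the style of common self-supervised training code. The `cosine_scheduler` helper, the final learning-rate value of 1e-6, and the 100-epoch run length are illustrative assumptions and are not taken from the paper; only the base values (lr = 0.0005 × batchsize / 256, weight decay 0.04 → 0.4, teacher temperature 0.04 → 0.07 over 30 epochs) come from the quoted setup.

```python
import numpy as np

def cosine_scheduler(base_value, final_value, epochs, iters_per_epoch,
                     warmup_epochs=0, warmup_start_value=0.0):
    """Per-iteration schedule: linear warmup followed by a cosine ramp to final_value."""
    warmup_iters = warmup_epochs * iters_per_epoch
    warmup = np.linspace(warmup_start_value, base_value, warmup_iters)

    iters = np.arange(epochs * iters_per_epoch - warmup_iters)
    cosine = final_value + 0.5 * (base_value - final_value) * (
        1 + np.cos(np.pi * iters / len(iters)))
    return np.concatenate((warmup, cosine))

# Assumed run configuration (100 epochs; ImageNet-1K has 1,281,167 training images).
batch_size, epochs = 1024, 100
iters_per_epoch = 1281167 // batch_size

# Linear scaling rule from the paper: lr = 0.0005 * batchsize / 256,
# with a 30-epoch linear warmup and cosine decay afterwards (final lr assumed 1e-6).
base_lr = 0.0005 * batch_size / 256
lr_schedule = cosine_scheduler(base_lr, 1e-6, epochs, iters_per_epoch, warmup_epochs=30)

# Weight decay follows a cosine schedule from 0.04 up to 0.4.
wd_schedule = cosine_scheduler(0.04, 0.4, epochs, iters_per_epoch)

# Teacher temperature: linear warmup from 0.04 to 0.07 over the first 30 epochs,
# then held constant (one value per epoch).
teacher_temp = np.concatenate((np.linspace(0.04, 0.07, 30),
                               np.full(epochs - 30, 0.07)))
```

In a training loop, `lr_schedule[it]` and `wd_schedule[it]` would be written into the optimizer's parameter groups at each iteration, and `teacher_temp[epoch]` would be used when sharpening the teacher outputs, mirroring how DINO-style codebases apply these schedules.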