DenseDINO: Boosting Dense Self-Supervised Learning with Token-Based Point-Level Consistency

Authors: Yike Yuan, Xinghe Fu, Yunlong Yu, Xi Li

IJCAI 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | 4 Experiment: We evaluate our method on both classification and semantic segmentation tasks. Table 1: Comparison (%) with other methods. 4.3 Ablation Study
Researcher Affiliation | Academia | Yike Yuan1, Xinghe Fu1, Yunlong Yu2 and Xi Li1,3; 1College of Computer Science and Technology, Zhejiang University; 2College of Information Science and Electronic Engineering, Zhejiang University; 3Zhejiang-Singapore Innovation and AI Joint Research Lab, Hangzhou; {yuanyike, xinghefu, yuyunlong, xilizju}@zju.edu.cn
Pseudocode | No | The paper does not include structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide an explicit statement about releasing its own source code or a link to a code repository for the described methodology.
Open Datasets | Yes | We choose ViT [Dosovitskiy et al., 2021] as the backbone and ImageNet [Russakovsky et al., 2015] as the training dataset. ... For semantic segmentation, we adopt linear probing protocol following [Ziegler and Asano, 2022]. We train a 1×1 convolutional layer on the frozen patch tokens on Pascal VOC 2012 [Everingham et al., 2010] train + aug split and report mIoU on the valid split.
Dataset Splits | Yes | We train a 1×1 convolutional layer on the frozen patch tokens on Pascal VOC 2012 [Everingham et al., 2010] train + aug split and report mIoU on the valid split.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models used for the experiments.
Software Dependencies | No | The paper mentions various models and frameworks (e.g., ViT, DINO, Leopart) but does not provide specific software dependencies with version numbers (e.g., Python, PyTorch versions).
Experiment Setup | Yes | We train ViT-Small with 4 views and 4 reference points in each pair of views for 300 epochs for the performance comparison in main results, and ViT-Tiny with 6 views and 4 reference points for 100 epochs for the ablation study. The loss weight α is set as 0.5. ... All models in the experiment are with patch size 16 and trained from scratch unless specified otherwise. Other training parameters are kept the same with the setting of DINO.
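
To make the Experiment Setup row concrete, the quoted pretraining configuration can be written out as a minimal sketch. Only the values taken directly from the paper (backbone, patch size, number of views, reference points per pair of views, epochs, and the loss weight α = 0.5) are grounded; the dictionary keys and the way the DINO-inherited settings are expressed here are assumptions for illustration, not the authors' released configuration.

```python
# Minimal sketch of the DenseDINO pretraining settings quoted above.
# Key names are assumptions; only the values come from the paper.
main_run = {
    "backbone": "vit_small",    # ViT-Small, trained from scratch
    "patch_size": 16,
    "num_views": 4,             # 4 views per image
    "ref_points_per_pair": 4,   # 4 reference points in each pair of views
    "epochs": 300,
    "alpha": 0.5,               # weight of the point-level consistency loss
    # all other optimizer / augmentation settings follow the DINO recipe
}

# Ablation configuration: ViT-Tiny, 6 views, 4 reference points, 100 epochs.
ablation_run = {**main_run, "backbone": "vit_tiny", "num_views": 6, "epochs": 100}
```

Any field not listed here would be expected to fall back to the corresponding DINO default, per the paper's note that the remaining training parameters are kept the same as DINO.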
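The segmentation evaluation quoted under Open Datasets and Dataset Splits (a 1×1 convolution trained on frozen patch tokens, reported as mIoU on Pascal VOC 2012) can be sketched in the same spirit. The 1×1 head, the frozen backbone, and patch size 16 come from the paper; the PyTorch module structure, the assumed (B, N, C) patch-token layout, and the 21 VOC classes are illustrative assumptions rather than the authors' code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LinearProbeSeg(nn.Module):
    """Hypothetical linear-probing head: a single 1x1 conv on frozen patch tokens."""

    def __init__(self, backbone: nn.Module, embed_dim: int, num_classes: int = 21):
        super().__init__()
        self.backbone = backbone              # pretrained ViT, kept frozen
        for p in self.backbone.parameters():
            p.requires_grad = False
        self.head = nn.Conv2d(embed_dim, num_classes, kernel_size=1)  # the 1x1 probe

    def forward(self, x):
        b, _, h, w = x.shape
        with torch.no_grad():
            tokens = self.backbone(x)         # assumed to return patch tokens (B, N, C)
        gh, gw = h // 16, w // 16             # patch size 16, as stated in the paper
        feat = tokens.transpose(1, 2).reshape(b, -1, gh, gw)  # (B, C, gh, gw) grid
        logits = self.head(feat)              # per-patch class logits
        return F.interpolate(logits, size=(h, w), mode="bilinear", align_corners=False)
```

Under this protocol only self.head would be optimized (for example with per-pixel cross-entropy on the VOC train + aug split), and mIoU would then be computed on the validation split.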