TF-CLIP: Learning Text-Free CLIP for Video-Based Person Re-identification

Authors: Chenyang Yu, Xuehu Liu, Yingquan Wang, Pingping Zhang, Huchuan Lu

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments demonstrate that our proposed method shows much better results than other state-of-the-art methods on MARS, LS-VID and iLIDS-VID.
Researcher Affiliation | Academia | School of Information and Communication Engineering, Dalian University of Technology, Dalian, China; School of Computer Science and Artificial Intelligence, Wuhan University of Technology, Wuhan, China; School of Future Technology, School of Artificial Intelligence, Dalian University of Technology, Dalian, China; Ningbo Institute, Dalian University of Technology, Ningbo, China
Pseudocode | No | No structured pseudocode or algorithm blocks were found in the paper.
Open Source Code | Yes | The code is available at https://github.com/AsuradaYuci/TF-CLIP.
Open Datasets | Yes | We evaluate our proposed approach on three video-based person ReID benchmarks, including MARS (Zheng et al. 2016), LS-VID (Li et al. 2019) and iLIDS-VID (Wang et al. 2014).
Dataset Splits | No | The paper describes sampling and mini-batch sizes for training but does not explicitly provide percentages or sample counts for the train/validation/test splits of the datasets.
Hardware Specification | Yes | Our model is implemented on the PyTorch platform and trained with one NVIDIA Tesla A30 GPU (24G memory).
Software Dependencies | No | The paper mentions the 'PyTorch platform' but does not specify a version number or list other software dependencies with versions.
Experiment Setup | Yes | During training, we sample 8 frames from each video sequence and each frame is resized to 256×128. In each mini-batch, we sample 4 identities, each with 4 tracklets. Thus, the number of images in a batch is 4×4×8 = 128. We also adopt random flipping and random erasing (Zhong et al. 2020) for data augmentation. We train our framework for 60 epochs in total by the Adam optimizer (Kingma and Ba 2014). Following CLIP-ReID (Li, Sun, and Li 2022), we first warm up the model for 10 epochs with a linearly growing learning rate from 5×10⁻⁷ to 5×10⁻⁶. Then, the learning rate is divided by 10 at the 30th and 50th epochs.
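
To make the reported optimization schedule concrete, the sketch below sets up Adam with a 10-epoch linear warm-up from 5×10⁻⁷ to 5×10⁻⁶, followed by tenfold learning-rate decays at epochs 30 and 50 over 60 total epochs. This is a minimal illustration against a placeholder model, not the authors' released code; the exact warm-up interpolation and all variable names here are assumptions.

```python
import torch

# Hypothetical stand-in for the TF-CLIP model (the real model is not reproduced here).
model = torch.nn.Linear(512, 512)

base_lr = 5e-6          # target learning rate after warm-up
warmup_start_lr = 5e-7  # starting learning rate
warmup_epochs = 10      # linear warm-up length reported in the paper
total_epochs = 60       # total training epochs reported in the paper

optimizer = torch.optim.Adam(model.parameters(), lr=base_lr)

def lr_at_epoch(epoch: int) -> float:
    """Learning rate for a 0-indexed epoch under the reported schedule."""
    if epoch < warmup_epochs:
        # Linear warm-up from 5e-7 to 5e-6 over the first 10 epochs
        # (assumed to reach the target at the last warm-up epoch).
        frac = epoch / max(warmup_epochs - 1, 1)
        return warmup_start_lr + frac * (base_lr - warmup_start_lr)
    lr = base_lr
    if epoch >= 30:
        lr /= 10.0  # first decay at the 30th epoch
    if epoch >= 50:
        lr /= 10.0  # second decay at the 50th epoch
    return lr

for epoch in range(total_epochs):
    for group in optimizer.param_groups:
        group["lr"] = lr_at_epoch(epoch)
    # ... iterate over mini-batches of 4 identities x 4 tracklets x 8 frames (128 images) ...
```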