Learning Commonality, Divergence and Variety for Unsupervised Visible-Infrared Person Re-identification

Authors: Jiangming Shi, Xiangbo Yin, Yachao Zhang, Zhizhong Zhang, Yuan Xie, Yanyun Qu

NeurIPS 2024

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Extensive experiments conducted on the publicly available SYSU-MM01 and RegDB datasets validate the effectiveness of the proposed method. |
| Researcher Affiliation | Collaboration | Jiangming Shi (1,3), Xiangbo Yin (2), Yachao Zhang (2), Zhizhong Zhang (4,5), Yuan Xie (3,4), Yanyun Qu (1,2). (1) Institute of Artificial Intelligence, Xiamen University; (2) School of Informatics, Xiamen University; (3) Shanghai Innovation Institute; (4) East China Normal University; (5) Shanghai Key Laboratory of Computer Software Evaluating and Testing |
| Pseudocode | No | The paper provides a framework diagram (Figure 1) and describes the methods in detail, but it does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code: https://github.com/shijiangming1/PCLHD |
| Open Datasets | Yes | We evaluate our method on two common benchmarks in VI-ReID: SYSU-MM01 [71] and RegDB [72]. |
| Dataset Splits | Yes | SYSU-MM01 is a large-scale public benchmark for the VI-ReID task, which contains 491 identities captured by four RGB cameras and two IR cameras in both outdoor and indoor environments. In this dataset, 22,258 RGB images and 11,909 IR images with 395 identities are collected for training. In the inference stage, the query set consists of 3,803 IR images with 96 identities and the gallery set contains 301 randomly selected RGB images. RegDB is collected by an RGB camera and an IR camera, and contains 4,120 RGB images and 4,120 IR images with 412 identities. To be specific, the dataset is randomly divided into two non-overlapping sets: one set is used for training and the other is for testing. |
| Hardware Specification | No | The paper does not specify any particular hardware components such as GPU or CPU models, memory, or cloud computing resources used for the experiments. |
| Software Dependencies | No | The paper mentions using a feature extractor from AGW [58] and augmentations from CAJ [1], but it does not specify version numbers for any programming languages, libraries, or specific software dependencies used in the experiments. |
| Experiment Setup | Yes | The input images are resized to 288 × 144. In one batch, we randomly sample 16 pseudo identities, and each pseudo identity samples 16 instances. We set M to be 16 for computational convenience. The number of epochs is 100, in which the first 50 epochs are trained by contrastive loss with the centroid prototype. For the last 50 epochs, we train the model by contrastive loss with both the hard and dynamic prototypes. E_CPCL is 50. At the beginning of each epoch, we utilize the DBSCAN [50] algorithm to generate pseudo labels. During the inference stage, we use the momentum encoder ϕ_m to extract features and take the features of the global average pooling layer to calculate cosine similarity for retrieval. The momentum values α and β are set to 0.1 and 0.999, respectively. The temperature hyper-parameter τ is set to 0.05 and the weighting hyper-parameter λ in Eq. (23) is 0.5. |
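
The hyper-parameters quoted in the Experiment Setup row can be gathered into a small configuration object, which makes the two-stage training schedule easier to see. The following is only a minimal sketch and is not taken from the authors' repository: the names `ExperimentSetup`, `prototypes_for_epoch`, and `ema_update` are hypothetical, epochs are assumed to be 0-indexed, 288 × 144 is assumed to mean height × width, and the exponential-moving-average form is a common convention rather than something the quoted text confirms. The quoted text also does not say which quantity α or β updates, so the sketch leaves that mapping open.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentSetup:
    """Values quoted in the Experiment Setup row; field names are hypothetical."""
    image_height: int = 288          # assumed: 288 x 144 means height x width
    image_width: int = 144
    ids_per_batch: int = 16          # pseudo identities sampled per batch
    instances_per_id: int = 16       # instances sampled per pseudo identity
    total_epochs: int = 100
    centroid_only_epochs: int = 50   # E_CPCL in the quoted text
    temperature: float = 0.05        # tau
    loss_weight: float = 0.5         # lambda in Eq. (23)
    momentum_alpha: float = 0.1      # alpha (what it updates is not stated here)
    momentum_beta: float = 0.999     # beta (what it updates is not stated here)

    @property
    def batch_size(self) -> int:
        return self.ids_per_batch * self.instances_per_id


def prototypes_for_epoch(epoch: int, cfg: ExperimentSetup) -> tuple:
    """Return which prototypes drive the contrastive loss at a 0-indexed epoch."""
    if not 0 <= epoch < cfg.total_epochs:
        raise ValueError(f"epoch {epoch} is outside [0, {cfg.total_epochs})")
    if epoch < cfg.centroid_only_epochs:
        return ("centroid",)
    return ("hard", "dynamic")


def ema_update(old, new, momentum):
    """Generic moving average, assumed form: momentum * old + (1 - momentum) * new."""
    return [momentum * o + (1.0 - momentum) * n for o, n in zip(old, new)]


cfg = ExperimentSetup()
print(cfg.batch_size)                  # 16 identities x 16 instances
print(prototypes_for_epoch(0, cfg))    # first stage
print(prototypes_for_epoch(50, cfg))   # second stage
```

Under these assumptions a batch holds 16 × 16 = 256 images, and the switch from the centroid prototype to the hard and dynamic prototypes happens at the start of the 51st epoch.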