Cross-video Identity Correlating for Person Re-identification Pre-training
Authors: Jialong Zuo, Ying Nie, Hanyu Zhou, Huaxin Zhang, Haoyu Wang, Tianyu Guo, Nong Sang, Changxin Gao
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct extensive experiments to verify the superiority of our CION in terms of efficiency and performance. CION achieves significantly leading performance with even fewer training samples. |
| Researcher Affiliation | Collaboration | Jialong Zuo 1, Ying Nie 2, Hanyu Zhou 1, Huaxin Zhang 1, Haoyu Wang 2, Tianyu Guo 2, Nong Sang 1, Changxin Gao 1. 1 National Key Laboratory of Multispectral Information Intelligent Processing Technology, School of Artificial Intelligence and Automation, Huazhong University of Science and Technology; 2 Huawei Noah's Ark Lab. |
| Pseudocode | No | The paper describes the denoising and self-distillation processes in detail using mathematical formulations and descriptive text, but it does not present them in a clearly labeled 'Pseudocode' or 'Algorithm' block/figure. |
| Open Source Code | No | To address privacy concerns associated with our dataset containing persons and the application of person Re-ID technology, we will implement a controlled release of our code, data, and models. Therefore, to ensure private information security, we would not provide open access to our data and code during the paper submission process. We provide training and testing logs in the supplementary materials to ensure the authenticity and validity of our results. Notably, we will publicly share the application link for all our sources after our paper is accepted. |
| Open Datasets | Yes | We conduct experiments on two datasets, i.e., Market1501 [50] and MSMT17 [40]... and in A.4 'SYNTH-PEDES [55]', 'MSMT17 [40]', 'Market1501 [50]'. |
| Dataset Splits | No | The paper states: 'Market1501 and MSMT17 are widely used person Re-ID datasets, which contain 32,688 images of 1,501 persons and 126,411 images of 4,101 persons, respectively.' It then describes pre-training and fine-tuning but does not provide explicit train/validation/test splits (e.g., as percentages or counts for each). |
| Hardware Specification | Yes | We pre-train our models on 8 V100 GPUs for 100 epochs. |
| Software Dependencies | No | The paper mentions 'MindSpore', 'CANN', 'DINO [1]', 'TransReID [20]', and 'C-Contrast [6]' but does not provide specific version numbers for any of these software components. |
| Experiment Setup | Yes | The intra-consistency criteria σcst and inter-discrimination criteria σdrm are set as 0.2 and 0.18, respectively. We utilize cosine distance to calculate the distance between two samples. The half-width rs of the sliding range is set as 1000. For pre-training, the global views and local views are resized to 256×128 and 128×64, respectively. We pre-train our models on 8 V100 GPUs for 100 epochs. We adopt a curriculum learning strategy for setting the images per identity, where the initial Nid is set to 2, and at epochs 40, 60, and 80, Nid increases to 4, 6, and 8, respectively. To optimize the utilization of GPU memory, the batch size per GPU and the number of cropped local views are adjusted according to the varying parameter counts of the models. All other settings for pre-training, such as learning rate, are consistent with those used in DINO [1]. |
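The setup row above quotes two thresholds (σcst = 0.2, σdrm = 0.18) applied to cosine distances between sample features. A minimal sketch of that distance computation is below; the function names and the direction of the threshold check are assumptions for illustration, not the authors' released code:

```python
import numpy as np

SIGMA_CST = 0.20  # intra-consistency criterion from the paper
SIGMA_DRM = 0.18  # inter-discrimination criterion from the paper

def cosine_distance(a: np.ndarray, b: np.ndarray) -> float:
    """1 - cosine similarity, the sample distance the paper says it uses."""
    return 1.0 - float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def within_intra_consistency(a: np.ndarray, b: np.ndarray) -> bool:
    # Assumed semantics: same-identity candidates should be closer than sigma_cst.
    return cosine_distance(a, b) <= SIGMA_CST
```

Identical vectors have distance 0 and orthogonal unit vectors have distance 1, so the two criteria carve out narrow same-identity neighborhoods in feature space.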
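The curriculum schedule for images per identity (Nid) quoted above is fully specified: start at 2 and raise to 4, 6, and 8 at epochs 40, 60, and 80 of the 100-epoch run. A small sketch of that schedule (the function name is an assumption, not from the paper):

```python
def images_per_identity(epoch: int) -> int:
    """Curriculum N_id as described in the setup: 2 initially,
    then 4, 6, and 8 from epochs 40, 60, and 80 onward."""
    # (starting epoch, N_id from that epoch onward), checked latest-first
    schedule = [(80, 8), (60, 6), (40, 4), (0, 2)]
    for start, n_id in schedule:
        if epoch >= start:
            return n_id
    return 2

# N_id sampled at a few points across the 100-epoch run
print([images_per_identity(e) for e in (0, 39, 40, 60, 99)])  # [2, 2, 4, 6, 8]
```

A lookup like this would typically gate the identity sampler at the start of each epoch, which is consistent with the paper's note that batch size and local-view counts are also adjusted per model to fit GPU memory.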