Co-Saliency Spatio-Temporal Interaction Network for Person Re-Identification in Videos
Authors: Jiawei Liu, Zheng-Jun Zha, Xierong Zhu, Na Jiang
IJCAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Extensive experiments on two benchmarks have demonstrated the effectiveness of the proposed method." and "We conduct extensive experiments to evaluate the proposed CSTNet on two video datasets and compare CSTNet with state-of-the-art methods. Moreover, we investigate the effectiveness of the proposed CSTNet and its components." |
| Researcher Affiliation | Academia | 1) University of Science and Technology of China, China; 2) Capital Normal University, China. {jwliu6,zhazj}@ustc.edu.cn, zxr8192@mail.ustc.edu.cn, jiangna@cnu.edu.cn |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described, nor does it include a specific repository link or explicit code release statement. |
| Open Datasets | Yes | We evaluate the proposed CSTNet on two commonly used video-based person re-identification datasets: MARS and iLIDS-VID. The MARS dataset contains 1,261 identities and a total of 20,715 video sequences captured by 6 cameras... [Zheng et al., 2016]. The iLIDS-VID dataset consists of 600 video sequences of 300 pedestrians... [Wang et al., 2014]. |
| Dataset Splits | No | The paper explicitly mentions 'training' and 'testing' sets but does not specify a 'validation' set split. |
| Hardware Specification | Yes | The implementation of the proposed method is based on the Pytorch framework with two Titan RTX GPUs. |
| Software Dependencies | No | The paper mentions 'Pytorch framework' but does not specify its version number or any other software dependencies with their specific versions. |
| Experiment Setup | Yes | The input video frames are re-scaled to the size of 3 × 256 × 128 and normalised with 1.0/256. The training set is enlarged by data augmentation strategies including random horizontal flipping and random erasing with a probability of 0.3. The parameters of CL, C1, H1 and W1 are set to 256, 128, 16 and 8 respectively. Each mini-batch contains 16 identities and 4 video clips for each identity. Each video clip samples 8 video frames. The Adam optimizer is adopted with the learning rate lr of 3e-4, the weight decay of 5e-4 and the Nesterov momentum of 0.9. The model is trained for 600 epochs in total. The learning rate lr is multiplied by 0.1 after every 200 epochs. |
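The experiment-setup row is concrete enough to express as a minimal PyTorch sketch, shown below. The CSTNet architecture itself is not released, so a placeholder module stands in for the model; the transform pipeline, optimizer, scheduler, batch composition, and epoch schedule follow the values quoted above, while the stand-in model and the loop skeleton are assumptions introduced here for illustration.

```python
import torch
from torch import nn, optim
from torchvision import transforms

# Placeholder for the (unreleased) CSTNet; a real reproduction would
# substitute the actual network here.
model = nn.Sequential(nn.Conv2d(3, 256, kernel_size=3, padding=1))

# Preprocessing and augmentation as reported: frames rescaled to
# 3 x 256 x 128, random horizontal flipping, and random erasing with
# probability 0.3. Note: ToTensor scales pixels by 1/255, a close
# approximation of the paper's stated 1.0/256 normalisation.
train_transform = transforms.Compose([
    transforms.Resize((256, 128)),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.RandomErasing(p=0.3),
])

# Optimizer and schedule as reported: Adam with lr 3e-4 and weight decay
# 5e-4, 600 epochs, lr multiplied by 0.1 every 200 epochs. The quoted
# "Nesterov momentum of 0.9" is an SGD concept; torch.optim.Adam takes no
# momentum argument, so it is omitted here.
optimizer = optim.Adam(model.parameters(), lr=3e-4, weight_decay=5e-4)
scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=200, gamma=0.1)

# Batch composition as reported: 16 identities x 4 clips each,
# 8 frames per clip.
P, K, T = 16, 4, 8
batch = torch.randn(P * K, T, 3, 256, 128)  # (clips, frames, C, H, W)

for epoch in range(600):
    # ... forward pass, loss computation, and optimizer.step()
    #     per mini-batch would go here ...
    scheduler.step()
```

This sketch also makes the missing-dependency finding above concrete: nothing in the quoted setup pins a PyTorch or torchvision version, so behavioural details such as the exact `RandomErasing` defaults would have to be assumed by anyone reproducing the work.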