Spatial-Temporal Person Re-Identification

Authors: Guangcong Wang, Jianhuang Lai, Peigen Huang, Xiaohua Xie

AAAI 2019, pp. 8933–8940 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we evaluate our st-ReID method on two large-scale person re-ID benchmark datasets, i.e., Market-1501 and DukeMTMC-reID, and show the superiority of the st-ReID model compared with other state-of-the-art methods. We then present ablation studies to reveal the benefits of each main component/factor of our method. Without bells and whistles, our st-ReID method achieves rank-1 accuracy of 98.1% on Market-1501 and 94.4% on DukeMTMC-reID, improving from the baselines 91.2% and 83.8%, respectively, outperforming all previous state-of-the-art methods by a large margin.
Researcher Affiliation | Academia | Guangcong Wang¹, Jianhuang Lai¹,²,³, Peigen Huang¹, Xiaohua Xie¹,²,³ (¹School of Data and Computer Science, Sun Yat-sen University, China; ²Guangdong Key Laboratory of Information Security Technology; ³Key Laboratory of Machine Intelligence and Advanced Computing, Ministry of Education)
Pseudocode | No | The paper describes methods like the Histogram-Parzen approach but does not include any structured pseudocode or algorithm blocks (e.g., labeled as "Algorithm" or "Pseudocode").
Open Source Code | No | The paper does not contain any explicit statement about providing open-source code for the methodology described, nor does it provide a link to a code repository.
Open Datasets | Yes | The Market-1501 dataset is collected in front of a supermarket in Tsinghua University. Overall, this dataset contains 32,668 annotated bounding boxes of 1,501 identities. Among them, 12,936 images from 751 identities are used for training, and 19,732 images from 750 identities plus distractors are used for the gallery. DukeMTMC-reID is a subset of the DukeMTMC dataset for image-based re-identification. Specifically, 702 IDs are selected as the training set and the remaining 702 IDs are used as the testing set.
Dataset Splits | No | The paper specifies training and testing splits for the datasets (e.g., "12,936 images from 751 identities are used for training" on Market-1501, and "702 IDs are selected as the training set" on DukeMTMC-reID), but it does not mention a separate validation set or its size/proportion for model tuning or early stopping.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts, or detailed computer specifications) used for running its experiments. It only mentions that the backbone model was "pre-trained on ImageNet".
Software Dependencies | No | The paper mentions using SGD, ResNet, and the PCB method, but it does not provide specific version numbers for any software dependencies (e.g., Python, PyTorch/TensorFlow versions, or specific library versions).
Experiment Setup | Yes | The training images are augmented with horizontal flip and normalization and resized to 384 × 192. We use SGD with a mini-batch size of 32. We train the visual feature stream for 60 epochs. The learning rate starts from 0.1 and is decayed to 0.01 after 40 epochs. The backbone model is pre-trained on ImageNet and the learning rate for all the pre-trained layers is set to 0.1 of the base learning rate. As for the spatial-temporal stream, we set the time interval t to 100 frames. We set the Gaussian kernel parameter σ to 50 and use the three-sigma rule to further reduce the computation. As for the joint metric, we set λ0, λ1, γ0 and γ1 to 1, 2, 5 and 5, respectively. (Illustrative sketches of this configuration follow the table.)
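
To make the quoted optimization settings concrete, here is a minimal PyTorch sketch of the visual-stream training configuration. The paper releases no code, so the model class, parameter-group names, and the momentum and normalization values below are assumptions; only the quoted hyperparameters (batch size 32, 60 epochs, base learning rate 0.1 decayed 10× at epoch 40, 0.1× learning rate for the ImageNet-pre-trained layers, 384 × 192 inputs with horizontal flip and normalization) come from the paper.

```python
# Minimal sketch of the quoted training setup (not the authors' code).
# Assumptions are marked inline; hyperparameters mirror the table above.
import torch
from torch import nn
from torchvision import models, transforms

class VisualStream(nn.Module):
    """Hypothetical stand-in for the PCB-style visual stream on a
    ResNet-50 backbone pre-trained on ImageNet (backbone pre-training
    per the paper; this simplified classification head is ours)."""
    def __init__(self, num_ids: int = 751):  # 751 training IDs on Market-1501
        super().__init__()
        self.backbone = models.resnet50(weights="IMAGENET1K_V1")
        self.backbone.fc = nn.Identity()            # keep 2048-d features
        self.classifier = nn.Linear(2048, num_ids)  # new, randomly initialized

    def forward(self, x):
        return self.classifier(self.backbone(x))

# Augmentation quoted in the paper: resize to 384 x 192, horizontal flip,
# normalization (ImageNet statistics here are an assumption).
train_tf = transforms.Compose([
    transforms.Resize((384, 192)),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

model = VisualStream()
BASE_LR = 0.1
optimizer = torch.optim.SGD(
    [
        # Pre-trained layers train at 0.1x the base learning rate.
        {"params": model.backbone.parameters(), "lr": 0.1 * BASE_LR},
        # Newly added layers use the full base learning rate.
        {"params": model.classifier.parameters(), "lr": BASE_LR},
    ],
    momentum=0.9,  # assumption: the paper does not state a momentum value
)
# 0.1 -> 0.01 after epoch 40, over 60 epochs total (mini-batch size 32).
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[40], gamma=0.1
)
```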
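The Histogram-Parzen estimation noted in the Pseudocode row can likewise be sketched from the quoted parameters alone: count camera-pair transitions into 100-frame time-interval bins, then smooth the histogram with a Gaussian (Parzen) kernel of σ = 50, truncated by the three-sigma rule. Whether σ is measured in frames or in bins is not stated, so this sketch assumes bin units.

```python
import numpy as np

def parzen_smooth(hist: np.ndarray, sigma: float = 50.0) -> np.ndarray:
    """Gaussian (Parzen-window) smoothing of a time-interval histogram.

    `hist[k]` counts transitions between one camera pair whose frame gap
    falls into bin k (bins of 100 frames, per the quoted setup). The
    three-sigma rule truncates the kernel support to +/- 3*sigma, which
    is the computational shortcut the paper mentions.
    """
    radius = int(np.ceil(3 * sigma))                 # three-sigma truncation
    offsets = np.arange(-radius, radius + 1)
    kernel = np.exp(-offsets**2 / (2.0 * sigma**2))
    kernel /= kernel.sum()                           # normalize the kernel
    smoothed = np.convolve(hist, kernel, mode="same")
    total = smoothed.sum()
    # Convert smoothed counts into an empirical transition probability.
    return smoothed / total if total > 0 else smoothed
```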
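Finally, the λ/γ values in the Experiment Setup row parameterize the paper's joint metric, which smooths each stream's score with a logistic function f(x) = 1/(1 + λe^(−γx)) before combining them. The multiplicative fusion below is our reading of the paper's formulation, not released reference code.

```python
import numpy as np

def logistic(x, lam: float, gamma: float):
    """f(x) = 1 / (1 + lam * exp(-gamma * x)): the smoothing applied to
    each stream's score before fusion."""
    return 1.0 / (1.0 + lam * np.exp(-gamma * x))

def joint_score(visual_sim, st_prob,
                lam0: float = 1.0, lam1: float = 2.0,
                gamma0: float = 5.0, gamma1: float = 5.0):
    """Fuse the visual similarity with the spatial-temporal probability.

    Defaults follow the quoted setting (lambda0 = 1, lambda1 = 2,
    gamma0 = gamma1 = 5). `visual_sim` is the visual-stream similarity
    and `st_prob` the smoothed spatial-temporal probability, e.g. from
    `parzen_smooth` above.
    """
    return logistic(visual_sim, lam0, gamma0) * logistic(st_prob, lam1, gamma1)
```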