Relation Network for Person Re-Identification

Authors: Hyunjong Park, Bumsub Ham (pp. 11839-11847)

AAAI 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on standard benchmarks, including the Market1501 (Zheng et al. 2015), DukeMTMC-reID (Ristani et al. 2016), and CUHK03 (Li et al. 2014), demonstrate the advantage of our approach for person re-ID.
Researcher Affiliation | Academia | Hyunjong Park, Bumsub Ham, School of Electrical and Electronic Engineering, Yonsei University, {hyunpark, bumsub.ham}@yonsei.ac.kr
Pseudocode | No | The paper describes the proposed methods using text, mathematical equations, and diagrams, but does not include explicit pseudocode or algorithm blocks.
Open Source Code | Yes | To encourage comparison and future work, our code and models are available online: https://cvlab-yonsei.github.io/projects/RRID/.
Open Datasets | Yes | We test our method on the following datasets and compare its performance with the state of the art. 1) The Market1501 dataset (Zheng et al. 2015) ... 2) The CUHK03 dataset (Li et al. 2014) ... 3) The DukeMTMC-reID (Ristani et al. 2016) ...
Dataset Splits | No | The paper refers to the standard training/test splits for the datasets (Market1501, CUHK03, DukeMTMC-reID) and gives the numbers of training, query, and gallery images, but does not explicitly describe a separate validation split or its size.
Hardware Specification | Yes | Training our model takes about six, three, and eight hours with two NVIDIA Titan Xp GPUs for the Market1501, CUHK03, and DukeMTMC-reID datasets, respectively.
Software Dependencies | No | All networks are trained end-to-end using PyTorch (Paszke et al. 2017). While PyTorch is mentioned, a specific version number required for reproduction is not provided.
Experiment Setup | Yes | We resize all images to 384 × 128 for training. We set the feature channel numbers C to 2,048 and c to 256. This results in 1,792- and 3,840-dimensional features for q_P6 and T(q_P2, q_P4, q_P6), respectively. We augment the training datasets with horizontal flipping and random erasing (Zhong et al. 2017b). We use stochastic gradient descent (SGD) as the optimizer with a momentum of 0.9 and a weight decay of 5e-4. We train our model with a batch size N of 64 for 80 epochs, randomly choosing 16 identities and sampling 4 person images per identity (NK = 16, NM = 4). The learning rate is initially set to 1e-3 for the backbone network and 1e-2 for the other parts, kept constant until epoch 40, and then divided by 10 every 20 epochs. We empirically set the weight parameter λ to 2 and fix it for all experiments.
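
As a rough illustration of this setup, the sketch below shows how the identity-balanced batch sampling (NK = 16 identities, NM = 4 images each, batch size 64) and the two-group SGD schedule quoted above could be wired up in PyTorch. It is a minimal sketch under stated assumptions: the names IdentityBatchSampler and build_optimizer, and the backbone/head split of the model, are illustrative choices and not the authors' released code.

    # Minimal sketch of the quoted training configuration (illustrative, not the
    # authors' code). Assumes the model is split into a "backbone" module and a
    # "head" module holding the other parts.
    import random
    from collections import defaultdict

    from torch.optim import SGD
    from torch.optim.lr_scheduler import MultiStepLR
    from torch.utils.data import Sampler


    class IdentityBatchSampler(Sampler):
        """Yields index batches of num_ids identities x num_imgs images (16 x 4 = 64 here)."""

        def __init__(self, labels, num_ids=16, num_imgs=4):
            # Group dataset indices by person identity label.
            self.index_by_id = defaultdict(list)
            for idx, pid in enumerate(labels):
                self.index_by_id[pid].append(idx)
            self.num_ids, self.num_imgs = num_ids, num_imgs

        def __iter__(self):
            pids = list(self.index_by_id)
            random.shuffle(pids)
            for i in range(0, len(pids) - self.num_ids + 1, self.num_ids):
                batch = []
                for pid in pids[i:i + self.num_ids]:
                    # Sample with replacement in case an identity has fewer than num_imgs images.
                    batch.extend(random.choices(self.index_by_id[pid], k=self.num_imgs))
                yield batch

        def __len__(self):
            return len(self.index_by_id) // self.num_ids


    def build_optimizer(backbone, head):
        """SGD with separate learning rates for the backbone and the other parts."""
        optimizer = SGD(
            [
                {"params": backbone.parameters(), "lr": 1e-3},  # backbone network
                {"params": head.parameters(), "lr": 1e-2},      # other parts
            ],
            momentum=0.9,
            weight_decay=5e-4,
        )
        # Learning rates are constant until epoch 40, then divided by 10 every 20 epochs
        # (i.e., at epochs 40 and 60 over 80 training epochs).
        scheduler = MultiStepLR(optimizer, milestones=[40, 60], gamma=0.1)
        return optimizer, scheduler

In use, the sampler would be passed to a DataLoader via its batch_sampler argument, and scheduler.step() would be called once per epoch so the learning rates drop by a factor of 10 at epochs 40 and 60.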