A Part Power Set Model for Scale-Free Person Retrieval
Authors: Yunhang Shen, Rongrong Ji, Xiaopeng Hong, Feng Zheng, Xiaowei Guo, Yongjian Wu, Feiyue Huang
IJCAI 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on the mainstream evaluation datasets, including Market-1501, DukeMTMC-reID and CUHK03, validate that our method achieves state-of-the-art performance. |
| Researcher Affiliation | Collaboration | Yunhang Shen (1), Rongrong Ji (1,2), Xiaopeng Hong (3,4), Feng Zheng (5), Xiaowei Guo (6), Yongjian Wu (6) and Feiyue Huang (6). (1) Fujian Key Laboratory of Sensing and Computing for Smart City, School of Information Science and Engineering, Xiamen University, 361005, China; (2) Peng Cheng Laboratory, China; (3) Xi'an Jiaotong University, China; (4) University of Oulu, Finland; (5) Southern University of Science and Technology; (6) Tencent Youtu Lab, Tencent Technology (Shanghai) Co., Ltd. |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | Yes | Code and models will be made publicly available. |
| Open Datasets | Yes | Market-1501 [Zheng et al., 2015] contains bounding boxes from a person detector, selected based on their intersection-over-union overlap with annotated bounding boxes. It has 1,501 persons and is split into training/test sets of 12,936/19,732 images. DukeMTMC-reID [Ristani et al., 2016; Zheng et al., 2017] is a subset of DukeMTMC for person re-ID. It contains 36,411 annotated images of 1,812 different identities captured by eight high-resolution cameras. A total of 1,404 identities are observed by at least two cameras; the remaining 408 identities are distractors. The training set contains 16,522 images of 702 identities, and the test set contains the other 702 identities. CUHK03 [Li et al., 2014] consists of 14,097 cropped images of 1,467 identities. Each identity is captured by two cameras, with about 5 images per view. The cropped images are produced in two ways: human annotation and DPM detection. We follow the new training/test protocol, which has 767 identities for training and 700 identities for testing. Both the labeled and detected subsets are used for training and testing. |
| Dataset Splits | Yes | Market-1501 [Zheng et al., 2015] [...] is split into training/test sets of 12,936/19,732 images. DukeMTMC-reID [Ristani et al., 2016; Zheng et al., 2017] [...] The training set contains 16,522 images of 702 identities and the test set contains the other 702 identities. CUHK03 [Li et al., 2014] [...] We follow the new training/test protocol, which has 767 identities for training and 700 identities for testing. (These splits are collected in the first sketch after this table.) |
| Hardware Specification | Yes | We use a step strategy with minibatch Stochastic Gradient Descent (SGD) to train the neural networks on a Tesla V100 GPU. |
| Software Dependencies | No | The paper states: 'Our experiments are implemented based on the Caffe2 framework.' However, it does not specify the version number for Caffe2 or any other software dependencies. |
| Experiment Setup | Yes | All images are resized to a resolution of 384×128, following [Sun et al., 2018]. The training images are augmented with horizontal flipping. We use a step strategy with minibatch Stochastic Gradient Descent (SGD) to train the neural networks on a Tesla V100 GPU. The maximum number of epochs, batch size, momentum, weight decay factor and base learning rate are set to 120, 64, 0.9, 0.0005 and 0.01, respectively. The base learning rate is halved every 10 epochs from epoch 60 to epoch 90. The learning rate for all new layers is set to 10× the base learning rate. The margin in the triplet loss is 1.4 in all our experiments. Multi-loss dynamic training [Zheng et al., 2019] is also used. We use the normalized feature for retrieval evaluation. (A hedged PyTorch sketch of this setup follows the table.) |
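For quick reference, here is a minimal sketch collecting the reported dataset splits as plain Python constants. All counts are quoted from the table above; the dictionary layout and key names are this report's own.

```python
# Train/test splits reported for the three benchmarks. Every count is
# quoted from the paper; the dict structure itself is just for reference.
DATASET_SPLITS = {
    "Market-1501": {
        "identities": 1501,
        "train_images": 12936,
        "test_images": 19732,
    },
    "DukeMTMC-reID": {
        "total_images": 36411,
        "train_images": 16522,
        "train_identities": 702,
        "test_identities": 702,
    },
    "CUHK03": {
        "total_images": 14097,
        "train_identities": 767,
        "test_identities": 700,
        "subsets": ("labeled", "detected"),  # both used for train and test
    },
}
```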
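The paper's experiments are implemented in Caffe2, and no reference code is quoted in the report, so the following is a hedged PyTorch re-expression of the stated setup, not the authors' implementation. The helper names (`make_optimizer`, `lr_multiplier`, `retrieval_features`) and the separation of parameters into backbone vs. new layers are assumptions for illustration; only the hyperparameter values come from the paper.

```python
import torch
import torch.nn.functional as F
from torch import nn, optim
from torchvision import transforms

# Hyperparameters quoted from the paper's experiment setup.
EPOCHS, BATCH_SIZE = 120, 64
BASE_LR, MOMENTUM, WEIGHT_DECAY = 0.01, 0.9, 0.0005
TRIPLET_MARGIN = 1.4

# Input pipeline: resize to 384x128 (height x width) and augment
# training images with horizontal flips, as reported.
train_transform = transforms.Compose([
    transforms.Resize((384, 128)),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
])

def make_optimizer(backbone_params, new_params):
    # Minibatch SGD; new layers train at 10x the base learning rate.
    # The backbone/new-layer split is assumed, not specified in the report.
    return optim.SGD(
        [
            {"params": backbone_params, "lr": BASE_LR},
            {"params": new_params, "lr": 10 * BASE_LR},
        ],
        momentum=MOMENTUM,
        weight_decay=WEIGHT_DECAY,
    )

def lr_multiplier(epoch: int) -> float:
    # Step schedule: halve the rate every 10 epochs from epoch 60 to 90.
    # Whether a final drop occurs exactly at epoch 90 is ambiguous in the
    # paper's wording; this sketch includes it.
    return 0.5 ** (max(0, min(epoch, 90) - 50) // 10)

# Triplet loss with the reported margin of 1.4.
triplet_loss = nn.TripletMarginLoss(margin=TRIPLET_MARGIN)

def retrieval_features(model, images):
    # L2-normalize embeddings before retrieval, matching the paper's
    # use of normalized features for evaluation.
    return F.normalize(model(images), dim=1)
```

Wrapping `lr_multiplier` in `torch.optim.lr_scheduler.LambdaLR(optimizer, lr_multiplier)` applies the same multiplier to both parameter groups, preserving the 10× ratio for new layers, with `scheduler.step()` called once per epoch.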