Multi-Scale 3D Convolution Network for Video Based Person Re-Identification
Authors: Jianing Li, Shiliang Zhang, Tiejun Huang (pp. 8618-8625)
AAAI 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Evaluations on three widely used benchmark datasets, i.e., MARS, PRID2011, and iLIDS-VID, demonstrate the substantial advantages of our method over existing 3D convolution networks and state-of-the-art methods. |
| Researcher Affiliation | Academia | Jianing Li, Shiliang Zhang, Tiejun Huang School of Electronics Engineering and Computer Science, Peking University, Beijing 100871, China {ljn-vmc, slzhang.jdl, tjhuang}@pku.edu.cn |
| Pseudocode | No | The paper provides network architectures and mathematical formulations but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any explicit statements about the release of source code for the described methodology or a link to a repository. |
| Open Datasets | Yes | We use three video ReID datasets as our evaluation protocols, including PRID-2011 (Hirzer et al. 2011), iLIDS-VID (Wang and Zhao 2014) and MARS (Zheng et al. 2016). |
| Dataset Splits | No | The paper mentions 'train/test identities' and 'fixed training and testing sets' but does not explicitly describe a separate validation dataset split with specific percentages or counts. |
| Hardware Specification | Yes | All of our experiments are implemented with GTX TITAN X GPU, Intel i7 CPU, and 128GB memory. |
| Software Dependencies | No | Our model is trained and fine-tuned with PyTorch. No specific version numbers for PyTorch or other software dependencies are provided. |
| Experiment Setup | Yes | Input images are resized to 256 × 128. The initial learning rate is set to 0.001 and is reduced by a factor of ten after 10 epochs; training finishes after 20 epochs. For 3D model training, we sample T adjacent frames from each video sequence as network input in each training epoch, and train the 3D models for 400 epochs in total. For length T = 8, the batch size is set to 24; for T = 16, the batch size is set to 12. The initial learning rate is set to 0.01 and is reduced by a factor of ten after 300 epochs. |
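
To make the reported 3D-training schedule concrete, the snippet below is a minimal PyTorch sketch of the settings quoted in the Experiment Setup row (T = 16, batch size 12, 256 × 128 inputs, initial learning rate 0.01 reduced by a factor of ten after 300 of 400 epochs). The tiny Conv3d stack, the SGD momentum value, and the random stand-in clips and labels are assumptions for illustration only; the authors' M3D architecture and data pipeline are not released.

```python
# Hedged sketch of the 3D-model training schedule described in the paper.
# Only the hyperparameters (clip length, batch size, input size, lr schedule,
# epoch count) come from the quoted setup; everything else is a placeholder.
import torch
import torch.nn as nn

T, BATCH = 16, 12                       # clip length and batch size for T = 16
NUM_IDS = 625                           # MARS training identities (standard split)

model = nn.Sequential(                  # toy stand-in for the unreleased M3D network
    nn.Conv3d(3, 16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool3d(1),
    nn.Flatten(),
    nn.Linear(16, NUM_IDS),
)
criterion = nn.CrossEntropyLoss()
# momentum = 0.9 is an assumption; the paper only reports the learning rate
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
# "reduced by a factor of ten after 300 epochs" -> multiply lr by 0.1 at epoch 300
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=300, gamma=0.1)

for epoch in range(400):                # 3D models are trained for 400 epochs
    clips = torch.randn(BATCH, 3, T, 256, 128)      # stand-in for T adjacent frames
    labels = torch.randint(0, NUM_IDS, (BATCH,))    # stand-in identity labels
    optimizer.zero_grad()
    loss = criterion(model(clips), labels)
    loss.backward()
    optimizer.step()
    scheduler.step()
```

The separate 2D fine-tuning schedule in the same row (learning rate 0.001, reduced by a factor of ten after 10 epochs, 20 epochs total) would follow the same pattern with `StepLR(optimizer, step_size=10, gamma=0.1)` and a 20-epoch loop.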