Learning Disentangled Representation for Robust Person Re-identification

Authors: Chanho Eom, Bumsub Ham

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experimental results demonstrate the effectiveness of IS-GAN, significantly outperforming the state of the art on standard reID benchmarks, including Market-1501, CUHK03, and DukeMTMC-reID."
Researcher Affiliation | Academia | Chanho Eom and Bumsub Ham, School of Electrical and Electronic Engineering, Yonsei University (cheom@yonsei.ac.kr, bumsub.ham@yonsei.ac.kr)
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | "Our code and models are available online: https://cvlab-yonsei.github.io/projects/ISGAN/."
Open Datasets | Yes | "We compare our model to the state of the art on person reID with the following benchmark datasets: Market-1501 [43], CUHK03 [44] and DukeMTMC-reID [45]."
Dataset Splits | Yes | "Following the standard split [43], we use 12,936 images of 751 identities for training and 19,732 images of 750 identities for testing. The CUHK03 dataset [44] contains 14,096 images of 1,467 identities captured by two cameras. For the training/testing split, we follow the experimental protocol in [46]. The DukeMTMC-reID dataset [45]... We use the training/test split provided by [45], corresponding to 16,522 images of 702 identities for training and 2,228 query and 17,661 gallery images of 702 identities for testing. ...To set other parameters, we randomly split IDs in the training dataset of Market-1501 [43] into 651/100 and used the corresponding images as training/validation sets."
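As a rough illustration of the 651/100 validation protocol quoted above, the sketch below partitions Market-1501 training images by identity. It assumes the standard Market-1501 filename convention (a 4-digit person-ID prefix); the function name and fixed seed are illustrative assumptions, not taken from the authors' released code.

```python
import random
from collections import defaultdict

def split_train_val(image_paths, n_val_ids=100, seed=0):
    """Split Market-1501 training images into disjoint train/val identity sets.

    Assumes each filename starts with a 4-digit person ID
    (e.g. '0002_c1s1_000451_03.jpg'). The 651/100 identity split matches
    the protocol quoted above; the seed is an assumption, as the paper
    does not report one.
    """
    by_id = defaultdict(list)
    for path in image_paths:
        pid = path.rsplit('/', 1)[-1].split('_')[0]
        by_id[pid].append(path)

    ids = sorted(by_id)                # 751 training identities in total
    random.Random(seed).shuffle(ids)
    val_ids = ids[:n_val_ids]          # 100 held-out validation identities

    train = [p for pid in ids[n_val_ids:] for p in by_id[pid]]
    val = [p for pid in val_ids for p in by_id[pid]]
    return train, val
```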
Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types, or memory amounts) used for running its experiments.
Software Dependencies | No | The paper mentions optimizers (Adam) and network components (ResNet-50, PatchGAN, batch normalization, Leaky ReLU, Dropout) and cites them, but does not provide specific version numbers for any software libraries, frameworks, or programming languages used (e.g., Python, PyTorch, TensorFlow, CUDA).
Experiment Setup | Yes | "[In the first stage,] the learning rate is set to 2e-4. In the second stage, we fix the baseline and train the identity-unrelated encoder E_U, the generator G, and the discriminators D_D and D_C with the corresponding losses L_U, L_S, L_PS, L_D, and L_C. This process iterates for 200 epochs with a learning rate of 2e-4. Finally, we train the whole network end-to-end with a learning rate of 2e-5 for 100 epochs. Following [49], we resize all images to 384 × 128. We augment the datasets with horizontal flipping and random erasing [50]. For each mini-batch, we randomly select 4 different identities and sample a set of 4 images for each identity. We empirically find that training with a large value of λ_U is unstable. We thus set λ_U to 0.001 in the second stage and increase it to 0.01 in the third stage to regularize the disentanglement. Following [26, 35], we fix λ_S and λ_D to 10 and 1, respectively. To set the other parameters, we randomly split IDs in the training dataset of Market-1501 [43] into 651/100 and used the corresponding images as training/validation sets. We use a grid search to set the parameters (λ_R = 20, λ_PS = 10, λ_C = 2) with λ_R ∈ {5, 10, 20}, λ_PS ∈ {5, 10, 20}, and λ_C ∈ {1, 2} on the validation split."
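For readers reconstructing this setup, the sketch below encodes the reported pipeline and schedule in PyTorch. It is a minimal sketch under stated assumptions: the module handles (baseline, E_U, G, D_D, D_C) and loss definitions are placeholders, since the actual networks are defined in the authors' released code.

```python
import random
import torch
import torchvision.transforms as T

# Reported input pipeline: resize to 384 x 128, horizontal flipping,
# random erasing [50]. RandomErasing operates on tensors, so it follows ToTensor.
train_transform = T.Compose([
    T.Resize((384, 128)),
    T.RandomHorizontalFlip(),
    T.ToTensor(),
    T.RandomErasing(),
])

# Reported loss weights: lambda_R, lambda_PS, lambda_C from the grid search on
# the 651/100 validation split; lambda_S and lambda_D fixed following [26, 35].
LAMBDAS = {"R": 20, "PS": 10, "C": 2, "S": 10, "D": 1}

def pk_batch(images_by_id, p=4, k=4):
    """Mini-batch sampling as reported: P=4 identities, K=4 images each
    (batch size 16). Assumes every identity has at least k images."""
    pids = random.sample(list(images_by_id), p)
    return [img for pid in pids for img in random.sample(images_by_id[pid], k)]

def make_optimizer(modules, lr):
    """Adam over the parameters of the modules trained in a given stage."""
    params = [p for m in modules for p in m.parameters()]
    return torch.optim.Adam(params, lr=lr)

# Hypothetical module handles; the real definitions are in the authors' code:
#   baseline, E_U, G, D_D, D_C = build_models()
#
# Stage 2: fix the baseline; train E_U, G, D_D, D_C for 200 epochs at lr 2e-4,
# with lambda_U = 0.001:
#   opt = make_optimizer([E_U, G, D_D, D_C], lr=2e-4)
#
# Stage 3: train the whole network end-to-end for 100 epochs at lr 2e-5,
# with lambda_U raised to 0.01:
#   opt = make_optimizer([baseline, E_U, G, D_D, D_C], lr=2e-5)
```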