Generalizable Person Re-identification via Balancing Alignment and Uniformity
Authors: Yoonki Cho, Jaeyoon Kim, Woo Jae Kim, Junsik Jung, Sung-eui Yoon
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experimental results demonstrate that BAU effectively exploits the advantages of data augmentation, which previous studies could not fully utilize, and achieves state-of-the-art performance without requiring complex training procedures. |
| Researcher Affiliation | Academia | Yoonki Cho, Jaeyoon Kim, Woo Jae Kim, Junsik Jung, Sung-Eui Yoon |
| Pseudocode | No | I could not find any structured pseudocode or algorithm blocks within the paper. |
| Open Source Code | Yes | The code is available at https://github.com/yoonkicho/BAU. |
| Open Datasets | Yes | We conduct experiments using the following datasets: Market-1501 [95], MSMT17 [83], CUHK02 [43], CUHK03 [44], CUHK-SYSU [85], PRID [31], GRID [55], VIPeR [26], and iLIDS [97], with dataset statistics shown in Table 1. |
| Dataset Splits | No | Table 2 (Evaluation protocols): Protocol-1 trains on Full-(M+C2+C3+CS) and tests on PRID, GRID, VIPeR, and iLIDS; Protocol-2 uses M+MS+CS → C3, M+CS+C3 → MS, and MS+CS+C3 → M; Protocol-3 uses Full-(M+MS+CS) → C3, Full-(M+CS+C3) → MS, and Full-(MS+CS+C3) → M. While detailed training and testing data splits are provided, an explicit 'validation' split with percentages or counts is not mentioned. |
| Hardware Specification | Yes | We implement our framework in PyTorch [64] and utilize two RTX-3090 GPUs for training. |
| Software Dependencies | No | We implement our framework in PyTorch [64] and utilize two RTX-3090 GPUs for training. We train the model for 60 epochs using Adam [38] with a weight decay of 5 × 10−4. (Software names are mentioned but specific version numbers are not provided.) |
| Experiment Setup | Yes | Following previous studies [34, 50, 51, 86, 90], we use ResNet-50 [29] pre-trained on ImageNet [13] with instance normalization layers as our backbone. All images are resized to 256 × 128. For each iteration, we sample 256 images, consisting of 64 identities with 4 instances for each identity. The total batch size during training is 512, including both original and augmented images. Random flipping, cropping, erasing [101], RandAugment [11], and color jitter are used for data augmentation. We train the model for 60 epochs using Adam [38] with a weight decay of 5 × 10−4. The initial learning rate is set to 3.5 × 10−4 and is decreased by a factor of 10 at the 30th and 50th epochs. A warmup strategy is applied during the first 10 epochs. The momentum μ is set to 0.1. We empirically set the weighting parameter λ to 1.5 and k for the weighting strategy to 10. |
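For readers cross-checking the quoted training schedule, the following is a minimal PyTorch sketch of the optimizer and learning-rate schedule described in the Experiment Setup row. It is not taken from the authors' released code (linked above): the linear warmup shape, the `LambdaLR` scheduler choice, the epoch-indexing convention, and the placeholder model are assumptions made only for illustration.

```python
# Sketch of the reported schedule: Adam (weight decay 5e-4), initial lr 3.5e-4,
# warmup for the first 10 epochs, lr divided by 10 at the 30th and 50th epochs,
# 60 epochs total. Warmup shape and epoch indexing are assumptions.
import torch
from torch.optim import Adam
from torch.optim.lr_scheduler import LambdaLR


def make_optimizer_and_scheduler(model):
    optimizer = Adam(model.parameters(), lr=3.5e-4, weight_decay=5e-4)

    def lr_lambda(epoch):
        # Linear warmup over the first 10 epochs (the paper only states that
        # a warmup strategy is applied; the linear shape is an assumption).
        if epoch < 10:
            return (epoch + 1) / 10
        # Step decay by a factor of 10 at the 30th and 50th epochs.
        factor = 1.0
        if epoch >= 30:
            factor *= 0.1
        if epoch >= 50:
            factor *= 0.1
        return factor

    scheduler = LambdaLR(optimizer, lr_lambda=lr_lambda)
    return optimizer, scheduler


# Usage sketch: placeholder model standing in for the ResNet-50 backbone + head.
model = torch.nn.Linear(2048, 751)
optimizer, scheduler = make_optimizer_and_scheduler(model)
for epoch in range(60):
    # ... one training epoch over batches of 64 identities x 4 instances,
    # plus augmented copies (effective batch size 512 as reported) ...
    scheduler.step()
```

This only reproduces the schedule-level hyperparameters quoted from the paper; the sampling strategy, augmentations, and the BAU loss itself would come from the authors' repository at https://github.com/yoonkicho/BAU.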