Modality-Adaptive Mixup and Invariant Decomposition for RGB-Infrared Person Re-identification

Authors: Zhipeng Huang, Jiawei Liu, Liang Li, Kecheng Zheng, Zheng-Jun Zha

AAAI 2022, pp. 1034-1042

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experimental results on two challenging benchmarks demonstrate superior performance of MID over state-of-the-art methods.
Researcher Affiliation | Academia | 1 University of Science and Technology of China; 2 Institute of Computing Technology, Chinese Academy of Sciences. {hzp1104,zkcys001}@mail.ustc.edu.cn, {jwliu6,zhazj}@ustc.edu.cn, liang.li@ict.ac.cn
Pseudocode | No | The paper describes the proposed methods mathematically and descriptively but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks.
Open Source Code | No | The paper does not contain any statements about releasing open-source code or provide links to a code repository.
Open Datasets | Yes | We evaluate the proposed MID using two public RGB-Infrared datasets: RegDB (Nguyen et al. 2017) and SYSU-MM01 (Wu et al. 2017).
Dataset Splits | Yes | The RegDB dataset contains 412 pedestrians. Each pedestrian has 10 visible images and 10 thermal images. Following the evaluation protocol (Ye et al. 2018a,b), this dataset is randomly split into two parts, 206 identities for training and the other 206 identities for testing, with two different testing modes, i.e., visible-to-thermal mode and thermal-to-visible mode. The reported results are the average of 10 random training/test splits on the RegDB dataset. SYSU-MM01 (Wu et al. 2017) is the largest existing RGB-infrared dataset, which was captured with 4 visible and 2 infrared cameras. The training set contains 395 persons with 22,258 RGB images and 11,909 IR images, while the testing set contains 96 persons with 3,803 IR images and 301 RGB images. (See the split sketch after this table.)
Hardware Specification | Yes | The proposed method is implemented by the PyTorch framework with one NVIDIA Tesla V100 GPU.
Software Dependencies | No | The paper mentions the 'PyTorch framework' but does not specify its version or any other software dependencies with version numbers.
Experiment Setup | Yes | Each mini-batch contains 96 images of 8 identities (each person has 4 RGB images, 4 IR images, and 4 generated mixed-modality images). The ResNet-50 (He et al. 2016) model is adopted as the backbone network. Part-pooling (Sun et al. 2018) is added after the backbone. The first three residual blocks of the ResNet-50 model are equipped with modality-adaptive convolution decomposition. The stride of the last convolution layer is set to 1. The margin ρ is set to 0.3. The parameters µ and ξ are set to 1 and 0.1, respectively. The trade-off parameters λ1,4,5 are set to 1, λ2,3 are set to 0.5, and λ6 is set to 0.1 in Eq. (8). The Adam optimizer is adopted to train the actor-critic agent, and the stochastic gradient descent (SGD) optimizer is used for MACD with a momentum of 0.9 and an initial learning rate of 0.05 and 0.02 on the RegDB and SYSU-MM01 datasets, respectively. The learning rates decay by 0.1 after 20 and 45 epochs. The whole MID framework is trained for 60 epochs on the RegDB dataset, which takes 1 hour, and for 100 epochs on the SYSU-MM01 dataset, which takes 6 hours. (See the training-setup sketch after this table.)
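
The RegDB evaluation protocol quoted in the Dataset Splits row can be illustrated with a short sketch: identities are split into equal train/test halves and results are averaged over 10 random trials. This is a minimal illustration, not code from the paper; the function name, seeding scheme, and use of Python's random module are assumptions.

```python
import random

def regdb_identity_split(identity_ids, seed):
    """Randomly split RegDB identities into equal train/test halves.

    RegDB has 412 identities (10 visible + 10 thermal images each); the
    quoted protocol uses 206 identities for training, 206 for testing,
    and averages results over 10 random splits.
    """
    rng = random.Random(seed)
    ids = list(identity_ids)
    rng.shuffle(ids)
    half = len(ids) // 2  # 206 identities per half for RegDB
    return ids[:half], ids[half:]

# Illustrative usage: generate the 10 random trials whose results are averaged.
all_ids = range(412)
trials = [regdb_identity_split(all_ids, seed=t) for t in range(10)]
```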
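
Below is a minimal PyTorch sketch of the optimization settings quoted in the Experiment Setup row, assuming a standard torchvision ResNet-50. The MACD modules, part-pooling head, mixed-modality image generation, and the loss weights of Eq. (8) are not reproduced; the `agent` module is a placeholder for the paper's actor-critic agent, and ImageNet pretraining is a common assumption not stated in the quote.

```python
import torch
from torchvision.models import resnet50

# ResNet-50 backbone with the stride of the last convolution stage set to 1,
# as in the quoted setup (weights=None here; ImageNet pretraining is typical
# for re-identification backbones but is an assumption, not a quoted detail).
backbone = resnet50(weights=None)
backbone.layer4[0].conv2.stride = (1, 1)
backbone.layer4[0].downsample[0].stride = (1, 1)

# Main-network optimizer: SGD with momentum 0.9; initial lr 0.05 for RegDB
# (0.02 for SYSU-MM01), decayed by 0.1 after epochs 20 and 45.
optimizer = torch.optim.SGD(backbone.parameters(), lr=0.05, momentum=0.9)
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[20, 45], gamma=0.1
)

# The actor-critic agent is trained with Adam; this Linear layer is only a
# placeholder, since the agent architecture is not detailed in the quote.
agent = torch.nn.Linear(2048, 2)
agent_optimizer = torch.optim.Adam(agent.parameters())

# Mini-batch composition: 8 identities x (4 RGB + 4 IR + 4 mixed) = 96 images.
NUM_EPOCHS = {"RegDB": 60, "SYSU-MM01": 100}
```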