High-Order Structure Based Middle-Feature Learning for Visible-Infrared Person Re-identification
Authors: Liuxiang Qiu, Si Chen, Yan Yan, Jing-Hao Xue, Da-Han Wang, Shunzhi Zhu
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on the SYSU-MM01, RegDB, and LLCM datasets show that our HOS-Net achieves superior state-of-the-art performance. |
| Researcher Affiliation | Academia | Liuxiang Qiu^{1,2}*, Si Chen^2*, Yan Yan^1, Jing-Hao Xue^3, Da-Han Wang^2, Shunzhi Zhu^2; 1: Xiamen University, China; 2: Xiamen University of Technology, China; 3: University College London, UK |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks (e.g., clearly labeled 'Algorithm' sections). |
| Open Source Code | Yes | Our code is available at https://github.com/Jaulaucoeng/HOS-Net. |
| Open Datasets | Yes | The SYSU-MM01 dataset (Wu et al. 2020) contains a total of 30,071 VIS images and 15,792 IR images from 491 different identities. The RegDB dataset (Nguyen et al. 2017) consists of 412 identities, where each identity has 10 VIS images and 10 IR images captured by two overlapping cameras. The LLCM dataset (Zhang and Wang 2023) is captured in low-light environments. |
| Dataset Splits | Yes | For each mini-batch, we randomly choose 8 identities, where 4 VIS images and 4 IR images of each identity are selected. We also randomly split the gallery set of the SYSU-MM01 and LLCM datasets ten times to report the average performance. We randomly divide the RegDB dataset for training and testing. The above process is repeated ten times and we report the average performance. |
| Hardware Specification | Yes | Our proposed HOS-Net is implemented with PyTorch on an NVIDIA RTX3090 GPU. |
| Software Dependencies | No | The paper states 'implemented with PyTorch', but does not provide specific version numbers for PyTorch or other software dependencies. |
| Experiment Setup | Yes | All the images are resized to 256 × 128 with horizontal flip, random erasing, and channel augmentation for data augmentation (Ye et al. 2021a) during the training phase. For each mini-batch, we randomly choose 8 identities, where 4 VIS images and 4 IR images of each identity are selected. We use the warm-up strategy to update the learning rate from 0.01 to 0.1 during the first 10 epochs. At the 20th and 50th epochs, the learning rates are set to 0.01 and 0.001, respectively. We use SGD as the optimizer and the momentum parameter is set to 0.9. The total number of training epochs is set to 120. The number of hyperedges M is set to 256. λ in Eq. (6) is set to 1.3. |
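
The "Dataset Splits" row above reports that each mini-batch is built from 8 identities with 4 VIS and 4 IR images per identity. Below is a minimal sketch of an identity-balanced cross-modality batch sampler that realizes this composition; the class name `CrossModalitySampler`, the index layout, and the sampling-with-replacement choice are assumptions for illustration, not taken from the authors' released code.

```python
import random
from collections import defaultdict
from torch.utils.data import Sampler


class CrossModalitySampler(Sampler):
    """Yield indices so each mini-batch holds `num_ids` identities,
    with `num_vis` visible and `num_ir` infrared images per identity."""

    def __init__(self, labels, modalities, num_ids=8, num_vis=4, num_ir=4):
        # labels[i]    : person identity of sample i
        # modalities[i]: 'vis' or 'ir' for sample i
        self.num_ids, self.num_vis, self.num_ir = num_ids, num_vis, num_ir
        self.vis_index = defaultdict(list)
        self.ir_index = defaultdict(list)
        for idx, (pid, mod) in enumerate(zip(labels, modalities)):
            (self.vis_index if mod == 'vis' else self.ir_index)[pid].append(idx)
        # Only identities present in both modalities can form a valid batch.
        self.pids = sorted(set(self.vis_index) & set(self.ir_index))
        self.num_batches = len(self.pids) // num_ids

    def __iter__(self):
        pids = random.sample(self.pids, len(self.pids))  # shuffle identities
        for b in range(self.num_batches):
            for pid in pids[b * self.num_ids:(b + 1) * self.num_ids]:
                # Sample with replacement in case an identity has few images.
                yield from random.choices(self.vis_index[pid], k=self.num_vis)
                yield from random.choices(self.ir_index[pid], k=self.num_ir)

    def __len__(self):
        return self.num_batches * self.num_ids * (self.num_vis + self.num_ir)
```

Passing this sampler to a `torch.utils.data.DataLoader` via its `sampler` argument (with `batch_size=64`, i.e. 8 identities × 8 images) would reproduce the stated batch structure.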
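
The "Experiment Setup" row also specifies the optimizer and learning-rate schedule (warm-up from 0.01 to 0.1 over the first 10 epochs, then 0.01 at epoch 20 and 0.001 at epoch 50, SGD with momentum 0.9, 120 epochs). The following sketch shows one way to express that schedule in PyTorch; the linear warm-up shape and the `model` placeholder are assumptions, since the paper does not give the exact warm-up formula.

```python
import torch


def adjust_learning_rate(optimizer, epoch, base_lr=0.1):
    """Warm up from 0.01 to 0.1 over the first 10 epochs,
    then decay to 0.01 at epoch 20 and 0.001 at epoch 50."""
    if epoch < 10:
        lr = 0.01 + (base_lr - 0.01) * epoch / 10  # linear warm-up (assumed shape)
    elif epoch < 20:
        lr = base_lr
    elif epoch < 50:
        lr = 0.01
    else:
        lr = 0.001
    for group in optimizer.param_groups:
        group['lr'] = lr
    return lr


# Usage sketch (model is a placeholder for the HOS-Net network):
# optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
# for epoch in range(120):              # total training epochs reported in the paper
#     adjust_learning_rate(optimizer, epoch)
#     ...                               # one training epoch
```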