Heterogeneous Test-Time Training for Multi-Modal Person Re-identification

Authors: Zi Wang, Huaibo Huang, Aihua Zheng, Ran He

AAAI 2024

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Experimental results on benchmark multi-modal ReID datasets RGBNT201, Market1501-MM, RGBN300, and RGBNT100 validate the effectiveness of the proposed method. |
| Researcher Affiliation | Academia | (1) School of Computer Science and Technology, Anhui University, Hefei, China; (2) MAIS & CRIPAC, CASIA, Beijing, China; (3) Information Materials and Intelligent Sensing Laboratory of Anhui Province, Anhui Provincial Key Laboratory of Multimodal Cognitive Computation, School of Artificial Intelligence, Anhui University, Hefei, China |
| Pseudocode | Yes | Algorithm 1: Multi-modal Test-time Training Strategy |
| Open Source Code | Yes | The codes can be found at https://github.com/ziwang1121/HTT. |
| Open Datasets | Yes | We first introduce multi-modal person ReID datasets RGBNT201 (Zheng et al. 2021) and Market1501-MM (Wang et al. 2022d), two multi-modal vehicle ReID datasets RGBNT100 and RGBN300 built by (Li et al. 2020). |
| Dataset Splits | No | The paper describes the training phase using source domain data and the evaluation on test sets, but it does not explicitly specify validation dataset splits or how validation was performed with distinct percentages or counts. |
| Hardware Specification | Yes | The implementation platform of our method is Pytorch (Paszke et al. 2019) with one RTX 3090Ti GPU. |
| Software Dependencies | No | The paper mentions 'Pytorch (Paszke et al. 2019)' as the implementation platform but does not specify its version number or the versions of other software dependencies required for reproducibility. |
| Experiment Setup | Yes | The learning rate in the training phase is set to 0.008. The maximum epoch is 80. The batch size is set to 32, consisting of 32 image triplets from four different identities. The weights of (α1, β1) are (0.5, 0.5). The learning rate for test-time training is 0.001. The batch size is set to 16... The balancing hyperparameters (α2, β2) are (1, 1) for RGBNT201 dataset. |
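
For anyone attempting a reproduction, the values quoted under Experiment Setup could be collected in one place, for example as below. This is only a sketch: the class and field names (`TrainConfig`, `TTTConfig`, `triplets_per_identity`, and so on) are hypothetical and are not taken from the HTT repository, and the even split of triplets across identities is an assumption, since the quoted text gives only the totals. Settings that the quoted text elides ("...") are left out rather than guessed.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class TrainConfig:
    """Source-domain training settings quoted in the Experiment Setup entry."""
    learning_rate: float = 0.008
    max_epochs: int = 80
    batch_size: int = 32            # 32 image triplets per batch
    identities_per_batch: int = 4   # "four different identities"
    alpha1: float = 0.5             # weights (alpha1, beta1) = (0.5, 0.5)
    beta1: float = 0.5

    @property
    def triplets_per_identity(self) -> int:
        # Assumption: triplets are divided evenly among the identities.
        return self.batch_size // self.identities_per_batch


@dataclass(frozen=True)
class TTTConfig:
    """Test-time training settings quoted in the Experiment Setup entry."""
    learning_rate: float = 0.001
    batch_size: int = 16
    alpha2: float = 1.0             # (alpha2, beta2) = (1, 1), stated for RGBNT201 only
    beta2: float = 1.0


if __name__ == "__main__":
    train_cfg = TrainConfig()
    ttt_cfg = TTTConfig()
    print(asdict(train_cfg))
    print(asdict(ttt_cfg))
    print("triplets per identity (assumed even split):",
          train_cfg.triplets_per_identity)
```

Keeping the two phases in separate frozen dataclasses makes it harder to reuse the training learning rate during test-time training by accident. The (α2, β2) defaults apply to RGBNT201 only; values for the other datasets are not given in the text quoted above.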