Semi-attention Partition for Occluded Person Re-identification

Authors: Mengxi Jia, Yifan Sun, Yunpeng Zhai, Xinhua Cheng, Yi Yang, Ying Li

AAAI 2023

Reproducibility Variable Result LLM Response
Research Type Experimental Experimental results confirm that the transformer student achieves substantial improvement after this semi-attention learning scheme, and produces new state-of-the-art accuracy on several standard re-ID benchmarks.
Researcher Affiliation Collaboration Mengxi Jia (1, 3), Yifan Sun (3), Yunpeng Zhai (4), Xinhua Cheng (4), Yi Yang (5), Ying Li (2). 1: School of Software and Microelectronics, Peking University, Beijing, China; 2: National Engineering Center of Software Engineering, Peking University, Beijing, China; 3: Baidu Research; 4: Peking University, China; 5: College of Computer Science and Technology, Zhejiang University, China. Contact: {mxjia, ypzhai, li.ying}@pku.edu.cn
Pseudocode No The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code No The paper does not provide any statement about releasing source code or a link to a code repository.
Open Datasets Yes Occluded-DukeMTMC (Miao et al. 2019b) contains 15,618 training images, 17,661 gallery images, and 2,210 occluded query images, which is by far the largest occluded re-ID dataset.
Dataset Splits Yes Occluded-DukeMTMC...The experiments on this dataset follow the standard setting (Miao et al. 2019b) and the training, query, and gallery sets contain 9%, 100%, and 10% occluded images, respectively. Market-1501...the whole dataset is divided into a training set containing 12,936 images of 751 identities and a testing set containing 19,732 images of 750 identities. MSMT17...The training set has 32,621 images of 1,041 identities, and the testing set has 93,820 images of 3,060 identities.
Hardware Specification Yes We use 1 NVIDIA A100 GPU for training.
Software Dependencies No The paper mentions software components like 'ViT-Base', 'SGD optimizer', 'cross-entropy loss', and 'softmax triplet loss' but does not provide specific version numbers for any software libraries or frameworks.
Experiment Setup Yes In both training and inference stages, the person images are resized to 256×128 and the patch size is 16×16 with 4/5 pixels overlapping for holistic/occluded datasets (He et al. 2021). The training images are augmented with random horizontal flipping, random cropping, and random erasing (Zhong et al. 2020) with a probability of 0.5. The batch size is set to 64 and the weight of the distillation loss α is set to 1.0 for all datasets. The dropout ratio p_drop is set to 1/10 for MSMT17 and 1/8 for the other three datasets. The SGD optimizer is adopted with a momentum of 0.9 and a weight decay of 10^-4. The learning rate is initialized as 8×10^-3 with cosine learning rate decay.
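Since the paper releases no code, the reported hyperparameters can only be reconstructed by hand. The sketch below gathers them into a single configuration dictionary; all key names are illustrative assumptions, not identifiers from the authors' implementation, and only values stated in the setup above are filled in.

```python
# Hypothetical reconstruction of the reported training setup.
# Key names are invented for illustration; values come from the paper's text.
CONFIG = {
    "image_size": (256, 128),          # H x W, training and inference
    "patch_size": 16,                  # 16x16 patches
    "patch_overlap_px": {"holistic": 4, "occluded": 5},
    "augmentations": {                 # each applied with probability 0.5
        "random_horizontal_flip_p": 0.5,
        "random_crop": True,
        "random_erasing_p": 0.5,       # (Zhong et al. 2020)
    },
    "batch_size": 64,
    "distillation_weight_alpha": 1.0,  # same for all datasets
    "dropout_ratio": {"MSMT17": 1 / 10, "default": 1 / 8},
    "optimizer": {
        "name": "SGD",
        "momentum": 0.9,
        "weight_decay": 1e-4,
    },
    "initial_lr": 8e-3,
    "lr_schedule": "cosine",
}

if __name__ == "__main__":
    # Quick sanity check that the recorded values are self-consistent.
    assert CONFIG["initial_lr"] > 0 and CONFIG["batch_size"] > 0
    print(CONFIG["optimizer"]["name"], CONFIG["initial_lr"])
```

A flat dictionary like this is enough to re-run the described setup in any framework; nothing here should be read as the authors' actual code.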