Semi-attention Partition for Occluded Person Re-identification
Authors: Mengxi Jia, Yifan Sun, Yunpeng Zhai, Xinhua Cheng, Yi Yang, Ying Li
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results confirm that the transformer student achieves substantial improvement after this semi-attention learning scheme, and produces new state-of-the-art accuracy on several standard re-ID benchmarks. |
| Researcher Affiliation | Collaboration | Mengxi Jia 1,3\*, Yifan Sun 3, Yunpeng Zhai 4, Xinhua Cheng 4, Yi Yang 5, Ying Li 2. 1 School of Software and Microelectronics, Peking University, Beijing, China; 2 National Engineering Center of Software Engineering, Peking University, Beijing, China; 3 Baidu Research; 4 Peking University, China; 5 College of Computer Science and Technology, Zhejiang University, China. {mxjia, ypzhai, li.ying}@pku.edu.cn, |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any statement about releasing source code or a link to a code repository. |
| Open Datasets | Yes | Occluded-DukeMTMC (Miao et al. 2019b) contains 15,618 training images, 17,661 gallery images, and 2,210 occluded query images, which is by far the largest occluded re-ID dataset. |
| Dataset Splits | Yes | Occluded-DukeMTMC...The experiments on this dataset follow the standard setting (Miao et al. 2019b) and the training, query, and gallery sets contain 9%, 100%, and 10% occluded images, respectively. Market-1501...the whole dataset is divided into a training set containing 12,936 images of 751 identities and a testing set containing 19,732 images of 750 identities. MSMT17...The training set has 32,621 images of 1,041 identities, and the testing set has 93,820 images of 3,060 identities. |
| Hardware Specification | Yes | We use 1 NVIDIA A100 GPU for training. |
| Software Dependencies | No | The paper mentions software components like 'ViT-Base', 'SGD optimizer', 'cross-entropy loss', and 'softmax triplet loss' but does not provide specific version numbers for any software libraries or frameworks. |
| Experiment Setup | Yes | In both training and inference stages, the person images are resized to 256×128 and the patch size is 16×16 with 4/5 pixels overlapping for holistic/occluded datasets (He et al. 2021). The training images are augmented with random horizontal flipping, random cropping and random erasing (Zhong et al. 2020) with a probability of 0.5. The batch size is set to 64 and the weight of distillation loss α is set to 1.0 for all datasets. The dropout ratio p_drop is set to 1/10 for MSMT17 and 1/8 for the other three datasets. SGD optimizer is adopted with a momentum of 0.9 and a weight decay of 10^-4. The learning rate is initialized as 8×10^-3 with cosine learning rate decay. |
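The hyperparameters quoted above can be collected into a single configuration sketch. This is a hypothetical reconstruction for reproduction purposes, not the authors' code; all names (`CONFIG`, `cosine_lr`) are assumptions, and only the numeric values come from the paper.

```python
import math

# Hypothetical training configuration assembled from the values the paper reports.
CONFIG = {
    "image_size": (256, 128),      # H x W, used in both training and inference
    "patch_size": (16, 16),        # with 4/5-pixel overlap (holistic/occluded)
    "batch_size": 64,
    "distill_loss_alpha": 1.0,     # weight of the distillation loss
    "dropout_ratio": 1 / 8,        # 1/10 for MSMT17, 1/8 for the other datasets
    "aug_probability": 0.5,        # flip / crop / random erasing
    "momentum": 0.9,               # SGD momentum
    "weight_decay": 1e-4,
    "base_lr": 8e-3,
}

def cosine_lr(step: int, total_steps: int, base_lr: float = 8e-3) -> float:
    """Cosine learning-rate decay from base_lr at step 0 down to 0 at the end."""
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * step / total_steps))
```

The schedule starts at the reported initial rate (8e-3) and decays smoothly to zero, which matches the standard cosine-decay recipe the paper names without further detail.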