KGDet: Keypoint-Guided Fashion Detection
Authors: Shenhan Qian, Dongze Lian, Binqiang Zhao, Tong Liu, Bohui Zhu, Hai Li, Shenghua Gao
AAAI 2021, pp. 2449-2457
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | we empirically show that keypoints are important cues to help improve the performance of clothing detection and further design a simple yet effective KGDet model that incorporates keypoint cues into clothing detection; extensive experiments validate the effectiveness of our method as well as the positive correlation between clothing detection and keypoint estimation. The proposed KGDet achieves superior performance on the DeepFashion2 dataset and FLD dataset with high efficiency. |
| Researcher Affiliation | Collaboration | ¹ShanghaiTech University, ²Alibaba Group, ³Ant Group, ⁴Shanghai Engineering Research Center of Intelligent Vision and Imaging |
| Pseudocode | No | The paper includes architectural diagrams (Figure 2, Figure 3) but no explicitly labeled pseudocode or algorithm blocks with structured steps. |
| Open Source Code | No | The paper does not contain any explicit statement about releasing source code or a direct link to a code repository for the methodology described. |
| Open Datasets | Yes | We evaluate the proposed method on the DeepFashion2 (Ge et al. 2019) and Fashion Landmark Detection (FLD) (Liu et al. 2016b) dataset. |
| Dataset Splits | Yes | Since only a subset of the dataset is released (192K images for training, 32K for validation, and 63K for test), our experiments are conducted on this publicly available portion. FLD (Liu et al. 2016b) defines 8 keypoints for 3 main types of clothes. There are 83K images for training, 19K for validation, and 19K for test. |
| Hardware Specification | Yes | batch size 8 with 4 NVIDIA P40 GPUs |
| Software Dependencies | No | The paper mentions "the SGD optimizer is employed to train the whole network" but does not specify version numbers for any software libraries, frameworks (e.g., PyTorch, TensorFlow), or programming languages (e.g., Python). |
| Experiment Setup | Yes | We input images with resolution no larger than 1333×800. We train our network with learning rate 5e-3, momentum 0.9, weight decay 1e-4, batch size 8 with 4 NVIDIA P40 GPUs, and the SGD optimizer is employed to train the whole network. We only use randomly horizontal flip as data augmentation. We empirically set λ1 = 0.1 and λ2 = 1 to balance different loss terms. |
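
For concreteness, the quoted training hyperparameters can be written out as a minimal PyTorch sketch. The optimizer settings, loss weights, and augmentation follow the Experiment Setup row above; the placeholder model and the way the λ-weighted terms are combined are assumptions, since the paper does not release code or spell out the loss formulation in this table.

```python
import torch
import torch.nn as nn
from torchvision import transforms

# Placeholder module standing in for the KGDet network; the actual
# architecture is not reproduced here (no open-source code is available).
model = nn.Conv2d(3, 16, kernel_size=3)

# SGD settings as quoted: learning rate 5e-3, momentum 0.9, weight decay 1e-4.
# (Batch size 8 is spread over 4 NVIDIA P40 GPUs in the reported setup.)
optimizer = torch.optim.SGD(
    model.parameters(), lr=5e-3, momentum=0.9, weight_decay=1e-4
)

# The only reported data augmentation: random horizontal flip.
# Input images are kept at a resolution no larger than 1333x800;
# the exact resize policy is not detailed in the table.
augmentation = transforms.RandomHorizontalFlip(p=0.5)

# Loss-balancing weights as quoted (lambda1 = 0.1, lambda2 = 1). Which loss
# terms they scale is not stated above, so the grouping below is only an
# illustrative assumption, not the authors' exact formulation.
lambda1, lambda2 = 0.1, 1.0

def combined_loss(main_term, weighted_term_1, weighted_term_2):
    """Weighted sum of loss terms (hypothetical grouping)."""
    return main_term + lambda1 * weighted_term_1 + lambda2 * weighted_term_2
```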