Detect Any Keypoints: An Efficient Light-Weight Few-Shot Keypoint Detector
Authors: Changsheng Lu, Piotr Koniusz
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Compared to the state of the art, our light-weight detector reduces the number of parameters by 50%, training/test time by 50%, and achieves 5.62% accuracy gain on 1-shot novel keypoint detection in the Animal pose dataset. We evaluate FSKD models using four datasets as follows: Animal pose dataset (Cao et al. 2019a), CUB (Wah et al. 2011), NABird (Van Horn et al. 2015), and AwA (Banik, Li, and Dong 2021). Table 4: Ablation study. |
| Researcher Affiliation | Collaboration | Changsheng Lu¹, Piotr Koniusz²,¹; ¹The Australian National University, ²Data61/CSIRO |
| Pseudocode | No | The paper describes the model architecture and components in text and figures but does not provide pseudocode or a clearly labeled algorithm block. |
| Open Source Code | No | The paper mentions 'We provide the detailed architecture of the kernel generator (KG) in Suppl. Mat.' but does not explicitly state that the source code for their method is available. |
| Open Datasets | Yes | We evaluate FSKD models using four datasets as follows: Animal pose dataset (Cao et al. 2019a) has five mammal species, i.e., cat, dog, cow, horse, and sheep, with over 6000 instances with keypoint annotations. ... AwA (Banik, Li, and Dong 2021) ... CUB (Wah et al. 2011) ... NABird (Van Horn et al. 2015) |
| Dataset Splits | Yes | CUB (Wah et al. 2011) consists of 200 species with 15 keypoint annotations. We use 100 species for training, 50 for validation, and 50 for testing. NABird (Van Horn et al. 2015) is a larger dataset than CUB with 555 categories, 11 types of annotated body parts, and 48,562 images. The species split is 333, 111, and 111 for training, validation and testing respectively. |
| Hardware Specification | Yes | Table 1: Efficiency test of 1-shot novel keypoint detection on the Animal pose dataset. The mean PCK over 5-subproblems is reported. IT is average keypoint inference time per query image measured in V100. |
| Software Dependencies | No | The paper mentions using ResNet50 as a backbone, but it does not specify software dependencies with version numbers (e.g., Python, PyTorch/TensorFlow versions, or specific library versions used for implementation). |
| Experiment Setup | Yes | The input image size for all models is 384 × 384 and the backbone of all compared methods uses ResNet50 (He et al. 2016). By default, our non-linear KG uses two resolutions S = {1, 3}, i.e., the group number of kernels and heatmaps is 2. In MFCL, the temperature τ = 0.05. The setting of negative keypoints is (α, ρ, N_neg) = (1.15, 30, 10). By default, we set λ = 0.002. |
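
The values quoted in the Dataset Splits and Experiment Setup rows can be collected into a small configuration object, which may help anyone attempting a reimplementation. The sketch below is only an illustration: the class name `FSKDSetup`, its field names, and the helper `splits_are_consistent` are hypothetical and do not come from the paper, and the excerpt does not say what λ weights, so it is stored as a bare number.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class FSKDSetup:
    """Hyperparameters as quoted in the Experiment Setup row (field names are hypothetical)."""
    image_size: Tuple[int, int] = (384, 384)   # input image size for all models
    backbone: str = "ResNet50"                 # backbone of all compared methods
    kg_resolutions: Tuple[int, ...] = (1, 3)   # S = {1, 3} for the non-linear KG
    mfcl_temperature: float = 0.05             # tau in MFCL
    neg_alpha: float = 1.15                    # alpha in the negative keypoint setting
    neg_rho: int = 30                          # rho in the negative keypoint setting
    neg_count: int = 10                        # N_neg in the negative keypoint setting
    lam: float = 0.002                         # lambda; its role is not stated in the excerpt

    @property
    def num_kernel_groups(self) -> int:
        # The row states that the group number of kernels and heatmaps
        # equals the number of resolutions in S.
        return len(self.kg_resolutions)


# Species-level (train, val, test) splits and totals as quoted in the Dataset Splits row.
SPECIES_SPLITS: Dict[str, Tuple[int, int, int]] = {
    "CUB": (100, 50, 50),
    "NABird": (333, 111, 111),
}
SPECIES_TOTALS: Dict[str, int] = {"CUB": 200, "NABird": 555}


def splits_are_consistent() -> bool:
    """Check that each quoted split sums to the quoted number of species."""
    return all(sum(SPECIES_SPLITS[name]) == SPECIES_TOTALS[name] for name in SPECIES_SPLITS)


if __name__ == "__main__":
    setup = FSKDSetup()
    print(setup)
    print("kernel groups:", setup.num_kernel_groups)
    print("splits consistent:", splits_are_consistent())
```

The split check confirms that the quoted numbers are internally consistent (100 + 50 + 50 = 200 for CUB and 333 + 111 + 111 = 555 for NABird). It says nothing about which species fall in which split, since the table does not list them.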