Detect Any Keypoints: An Efficient Light-Weight Few-Shot Keypoint Detector

Authors: Changsheng Lu, Piotr Koniusz

Venue: AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Compared to the state of the art, our light-weight detector reduces the number of parameters by 50%, training/test time by 50%, and achieves 5.62% accuracy gain on 1-shot novel keypoint detection in the Animal pose dataset. We evaluate FSKD models using four datasets as follows: Animal pose dataset (Cao et al. 2019a), CUB (Wah et al. 2011), NABird (Van Horn et al. 2015), and AwA (Banik, Li, and Dong 2021). Table 4: Ablation study.
Researcher Affiliation | Collaboration | Changsheng Lu (1), Piotr Koniusz (2,1); (1) The Australian National University, (2) Data61/CSIRO
Pseudocode | No | The paper describes the model architecture and components in text and figures but does not provide pseudocode or a clearly labeled algorithm block.
Open Source Code | No | The paper mentions 'We provide the detailed architecture of the kernel generator (KG) in Suppl. Mat.' but does not explicitly state that the source code for their method is available.
Open Datasets | Yes | We evaluate FSKD models using four datasets as follows: Animal pose dataset (Cao et al. 2019a) has five mammal species, i.e., cat, dog, cow, horse, and sheep, with over 6000 instances with keypoint annotations. ... AwA (Banik, Li, and Dong 2021) ... CUB (Wah et al. 2011) ... NABird (Van Horn et al. 2015)
Dataset Splits | Yes | CUB (Wah et al. 2011) consists of 200 species with 15 keypoint annotations. We use 100 species for training, 50 for validation, and 50 for testing. NABird (Van Horn et al. 2015) is a larger dataset than CUB with 555 categories, 11 types of annotated body parts, and 48,562 images. The species split is 333, 111, and 111 for training, validation and testing respectively.
Hardware Specification | Yes | Table 1: Efficiency test of 1-shot novel keypoint detection on the Animal pose dataset. The mean PCK over 5-subproblems is reported. IT is average keypoint inference time per query image measured in V100.
Software Dependencies | No | The paper mentions using ResNet50 as a backbone, but it does not specify software dependencies with version numbers (e.g., Python, PyTorch/TensorFlow versions, or specific library versions used for implementation).
Experiment Setup | Yes | The input image size for all models is 384 × 384 and the backbone of all compared methods uses ResNet50 (He et al. 2016). By default, our non-linear KG uses two resolutions S = {1, 3}, i.e., the group number of kernels and heatmaps is 2. In MFCL, the temperature τ = 0.05. The setting of negative keypoints is (α, ρ, N_neg) = (1.15, 30, 10). By default, we set λ = 0.002.
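
The reported setup and species splits can be collected into a small configuration sketch, which may be handy when attempting a reproduction. This is only an illustration: the paper releases no code, so the class name, field names, and helper function below are hypothetical, and the role of λ is not stated in the excerpt above, so it is stored as a plain default. Only the numeric values come from the quoted text.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class ReportedSetup:
    """Hyperparameters quoted in the summary; all field names are hypothetical."""
    image_size: Tuple[int, int] = (384, 384)   # input size for all models
    backbone: str = "ResNet50"                 # backbone of all compared methods
    kg_resolutions: Tuple[int, ...] = (1, 3)   # S = {1, 3} for the non-linear KG
    mfcl_temperature: float = 0.05             # tau in MFCL
    neg_alpha: float = 1.15                    # alpha for negative keypoints
    neg_rho: float = 30.0                      # rho for negative keypoints
    neg_count: int = 10                        # N_neg
    lam: float = 0.002                         # lambda (role not given in the excerpt)

    @property
    def num_groups(self) -> int:
        # The text equates the number of resolutions with the number of
        # kernel/heatmap groups.
        return len(self.kg_resolutions)


# Species-level splits as quoted for CUB and NABird.
SPECIES_SPLITS: Dict[str, Dict[str, int]] = {
    "CUB": {"total": 200, "train": 100, "val": 50, "test": 50},
    "NABird": {"total": 555, "train": 333, "val": 111, "test": 111},
}


def split_is_consistent(name: str) -> bool:
    """Check that train + val + test equals the reported species count."""
    s = SPECIES_SPLITS[name]
    return s["train"] + s["val"] + s["test"] == s["total"]


if __name__ == "__main__":
    setup = ReportedSetup()
    print(setup.num_groups)
    for dataset in SPECIES_SPLITS:
        print(dataset, split_is_consistent(dataset))
```

Both reported splits add up to the stated totals (100 + 50 + 50 = 200 for CUB and 333 + 111 + 111 = 555 for NABird), which the helper confirms.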