Decompose to Generalize: Species-Generalized Animal Pose Estimation

Authors: Guangrui Li, Yifan Sun, Zongxin Yang, Yi Yang

ICLR 2023

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Experimental results show that all these decomposition manners yield reasonable joint concepts and substantially improve cross-species generalization (and the attention-based approach is the best). |
| Researcher Affiliation | Collaboration | Guangrui Li (1, 2), Yifan Sun (2), Zongxin Yang (3), Yi Yang (3). 1: ReLER, AAII, University of Technology Sydney; 2: Baidu Inc.; 3: CCAI, College of Computer Science and Technology, Zhejiang University. |
| Pseudocode | Yes | Algorithm: A PyTorch-style pseudocode for pixel-to-concept attention (a minimal sketch of such a module is given below the table). |
| Open Source Code | No | The proposed method is reproducible. We have provided the PyTorch-style code for the proposed attention module, and the detailed training strategies in the main text and the appendix. However, no specific link to an open-source code repository is provided. |
| Open Datasets | Yes | We evaluate our method on two large-scale animal datasets. AP-10K (Yu et al., 2021) is a large-scale benchmark for mammal animal pose estimation... Animal Pose Dataset (Cao et al., 2019) collects and annotates 5 species... Animal Kingdom (Ng et al., 2022) is another dataset... |
| Dataset Splits | Yes | We perform domain generalization with the leave-one-out setting, i.e., selecting one species / family as the target domain and the rest as the source domains (see the split sketch below the table). |
| Hardware Specification | No | No specific hardware details (e.g., GPU models, CPU models, memory specifications) used for running the experiments are provided in the paper. Table 9 presents FLOPs and inference speed but does not specify the hardware used for these measurements. |
| Software Dependencies | No | The paper mentions PyTorch-style pseudocode in the algorithm listing, but it does not specify any software dependencies with version numbers (e.g., Python version, PyTorch version, specific library versions). |
| Experiment Setup | Yes | The batch size is set to 64, and the learning rates for the first and second stages are set to 5 × 10⁻⁴ and 5 × 10⁻⁵, respectively. We optimize the model with Adam for 210 epochs, where the learning rate decreases (×10⁻¹) at epochs 170 and 200, respectively. The input image size is 256 × 256 and the heatmap size is 64 × 64. The number of concept-specific blocks is set to 2, and k is set to 3 for all transfer tasks (see the training-schedule sketch below the table). |
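The paper's pixel-to-concept attention is only named, not reproduced, in this report. Below is a minimal sketch of what such a module could look like, assuming learnable concept queries that attend over flattened pixel features; the class name `PixelToConceptAttention` and all shapes are illustrative assumptions, not the authors' released code.

```python
# A minimal sketch of pixel-to-concept attention, assuming learnable
# concept embeddings that attend over flattened pixel features.
# Module name and shapes are hypothetical, not the paper's exact code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class PixelToConceptAttention(nn.Module):
    def __init__(self, num_concepts: int, dim: int):
        super().__init__()
        # One learnable query embedding per joint concept.
        self.concepts = nn.Parameter(torch.randn(num_concepts, dim))
        self.scale = dim ** -0.5

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (B, C, H, W) backbone feature map.
        b, c, h, w = feats.shape
        pixels = feats.flatten(2).transpose(1, 2)          # (B, H*W, C)
        # Similarity between each concept query and every pixel.
        attn = torch.einsum('kc,bnc->bkn', self.concepts, pixels) * self.scale
        attn = F.softmax(attn, dim=-1)                     # attend over pixels
        # Aggregate pixel features into one descriptor per concept.
        return torch.einsum('bkn,bnc->bkc', attn, pixels)  # (B, K, C)


# Example: k = 3 concepts over an 8x8 feature map.
module = PixelToConceptAttention(num_concepts=3, dim=256)
out = module(torch.randn(2, 256, 8, 8))
print(out.shape)  # torch.Size([2, 3, 256])
```

The softmax runs over pixel positions, so each concept produces a spatial attention map and a pooled feature, which matches the "pixel-to-concept" direction described in the paper's algorithm caption.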
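The leave-one-out protocol from the Dataset Splits row can be made concrete in a few lines. The species list below is a placeholder, not the actual AP-10K or Animal Pose taxonomy.

```python
# A minimal sketch of the leave-one-out setting: each species in turn
# is the held-out target domain; the rest form the source domains.
species = ['dog', 'cat', 'cow', 'horse', 'sheep']  # placeholder names

for target in species:
    sources = [s for s in species if s != target]
    print(f'target: {target:6s} sources: {sources}')
```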
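The hyperparameters in the Experiment Setup row map directly onto a standard PyTorch schedule. The following is a sketch under the assumption that plain Adam with a multi-step decay matches the description; `model` is a placeholder for the pose network, not the authors' training script.

```python
# A minimal sketch of the reported schedule: Adam, batch size 64,
# initial learning rate 5e-4 (first stage), decayed by 10x at epochs
# 170 and 200 over 210 total epochs.
import torch

model = torch.nn.Conv2d(3, 17, kernel_size=3, padding=1)  # placeholder network
optimizer = torch.optim.Adam(model.parameters(), lr=5e-4)
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[170, 200], gamma=0.1)

for epoch in range(210):
    # ... one training epoch over 256x256 crops with 64x64 heatmaps ...
    scheduler.step()
```

The second stage would restart the same loop with `lr=5e-5`, per the reported two-stage setup.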