UniAP: Towards Universal Animal Perception in Vision via Few-Shot Learning

Authors: Meiqi Sun, Zhonghan Zhao, Wenhao Chai, Hanjun Luo, Shidong Cao, Yanting Zhang, Jenq-Neng Hwang, Gaoang Wang

AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the effectiveness of UniAP through comprehensive experiments in pose estimation, segmentation, and classification tasks on diverse animal species, showcasing its ability to generalize and adapt to new classes with minimal labeled examples. |
| Researcher Affiliation | Academia | (1) Zhejiang University-University of Illinois Urbana-Champaign Institute, Zhejiang University; (2) College of Computer Science and Technology, Zhejiang University; (3) Electrical and Computer Engineering Department, University of Washington; (4) Department of Computer Science and Technology, Donghua University; (5) Shanghai Artificial Intelligence Laboratory |
| Pseudocode | No | No pseudocode or clearly labeled algorithm blocks were found in the paper. |
| Open Source Code | No | No explicit statement providing concrete access to open-source code for the described methodology was found. |
| Open Datasets | Yes | We apply UniAP on three datasets for animal perception tasks: Animal Kingdom (Ng et al. 2022), Animal Pose (Cao et al. 2019), and APT-36K (Yang et al. 2022) for pose estimation. Additionally, we utilize the Oxford-IIIT Pet dataset (Parkhi et al. 2012) for the segmentation task. |
| Dataset Splits | No | No specific details on validation dataset splits (percentages or counts) were provided. The paper mentions: 'We stop the training based on the validation metric threshold.' |
| Hardware Specification | Yes | We conduct the experiment using 8 NVIDIA RTX 3090 GPUs. |
| Software Dependencies | No | No specific software versions for dependencies were provided. The paper mentions using 'the Adam optimizer', 'the poly method', and 'BEiT-Base backbone', but without version numbers. |
| Experiment Setup | Yes | To train our model, we utilize the Adam optimizer (Kingma and Ba 2015) for 1K warmup iterations and 200K iterations in total. Our learning rate schedule follows the poly method (Liu, Rabinovich, and Berg 2015) with a 0.9 decay rate, with base learning rates of 10^-5 for pre-trained parameters and 10^-4 for parameters trained from scratch. Our global batch size is 64, with different tasks randomly sampled for each batch. We also include prompt and query sets of size 5 each in every batch. (A hedged sketch of this configuration follows the table.) |
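
To make the reported recipe concrete, below is a minimal sketch of the training configuration, assuming the poly schedule scales each base rate by (1 - t/T)^0.9 after a linear 1K-step warmup. PyTorch itself, the LambdaLR-based scheduling, and the placeholder pretrained/scratch modules are all assumptions for illustration; the paper does not specify the framework or how warmup and decay compose.

```python
# Hypothetical sketch of the reported training setup; not the authors' code.
# Assumes PyTorch and a LambdaLR-implemented warmup + poly decay schedule.
import torch
import torch.nn as nn

WARMUP_ITERS = 1_000    # reported warmup length
TOTAL_ITERS = 200_000   # reported total iterations
POLY_POWER = 0.9        # reported poly decay rate

# Placeholder modules: the paper uses a BEiT-Base backbone plus task-specific
# heads; any pre-trained / from-scratch split works the same way here.
pretrained = nn.Linear(768, 768)  # stands in for pre-trained parameters
scratch = nn.Linear(768, 17)      # stands in for parameters trained from scratch

# Two parameter groups with the reported base learning rates.
optimizer = torch.optim.Adam([
    {"params": pretrained.parameters(), "lr": 1e-5},
    {"params": scratch.parameters(), "lr": 1e-4},
])

def lr_lambda(step: int) -> float:
    """Linear warmup for 1K steps, then poly decay: (1 - t/T) ** 0.9."""
    if step < WARMUP_ITERS:
        return (step + 1) / WARMUP_ITERS
    progress = (step - WARMUP_ITERS) / (TOTAL_ITERS - WARMUP_ITERS)
    return (1.0 - progress) ** POLY_POWER

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)

for step in range(TOTAL_ITERS):
    # Each global batch of 64 mixes randomly sampled tasks; every batch also
    # carries prompt and query sets of size 5 each (as reported).
    x = torch.randn(64, 768)              # dummy batch
    loss = scratch(pretrained(x)).mean()  # dummy forward pass
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()
```

Under this composition, the multiplier ramps linearly to 1.0 over the first 1K steps and then decays toward zero by step 200K, while the 10:1 ratio between the from-scratch and pre-trained learning rates is preserved throughout.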