Test-Time Personalization with a Transformer for Human Pose Estimation
Authors: Yizhuo Li, Miao Hao, Zonglin Di, Nitesh Bharadwaj Gundavarapu, Xiaolong Wang
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We experiment with multiple datasets and show significant improvements on pose estimations with our self-supervised personalization. We perform our experiments with multiple human pose estimation datasets including Human 3.6M [24], Penn Action [71], and BBC Pose [8] datasets. |
| Researcher Affiliation | Academia | Yizhuo Li Shanghai Jiao Tong University liyizhuo@sjtu.edu.cn Miao Hao UC San Diego mhao@ucsd.edu Zonglin Di UC San Diego zodi@ucsd.edu Nitesh B. Gundavarapu UC San Diego nbgundav@ucsd.edu Xiaolong Wang UC San Diego xiw012@ucsd.edu |
| Pseudocode | No | The paper describes the proposed pipeline and methods in prose and with diagrams (Figure 2), but does not contain a formal pseudocode or algorithm block. |
| Open Source Code | Yes | Project page with code is available at https://liyz15.github.io/TTP/. |
| Open Datasets | Yes | Our experiments are performed on three human pose datasets... Human 3.6M [24]... Penn Action [71]... BBC Pose [8]... |
| Dataset Splits | Yes | Following the standard protocol [74, 34], we used 5 subjects for training and 2 subjects for testing. (Human 3.6M), We use the standard training/testing split (Penn Action), We use 610,115 labeled frames in the first ten videos for training, and we use 2,000 frames in the remaining ten videos (200 frames per video) with manual annotation for testing. (BBC Pose). |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types, or cloud instance specifications) used for running its experiments. |
| Software Dependencies | No | The paper mentions using 'Adam [31] optimizer', but does not specify version numbers for programming languages, libraries, or other ancillary software dependencies required for replication. |
| Experiment Setup | Yes | For all datasets, we use batch size of 32, Adam [31] optimizer with learning rate 0.001 and decay the learning rate twice during training. We use learning schedule [18k, 24k, 28k], [246k, 328k, 383k] and [90k, 120k, 140k] for BBC Pose, Penn Action, and Human 3.6M respectively. We divide the learning rate by 10 after each stage. During Test-Time Personalization, we use Adam optimizer with fixed learning rate 1 × 10⁻⁴. The weight of self-supervised loss is set to λ = 1 × 10⁻³ for Penn Action and BBC Pose, λ = 1 × 10⁻⁵ for Human 3.6M. |
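The training schedule reported in the Experiment Setup row can be sketched as a simple step-decay function. This is a minimal illustration, not the authors' code: the milestone values follow the quoted BBC Pose schedule [18k, 24k, 28k], and reading the final value (28k) as the total iteration count, with the two earlier values as decay points, is an assumption on our part.

```python
def learning_rate(step, base_lr=1e-3, milestones=(18_000, 24_000), gamma=0.1):
    """Step-decay schedule as described in the paper's setup.

    base_lr    -- initial learning rate (0.001, as reported)
    milestones -- iterations at which the lr is divided by 10; the values
                  here assume the BBC Pose schedule [18k, 24k, 28k], with
                  28k taken to be the total number of iterations
    gamma      -- decay factor (1/10, "divide the learning rate by 10")
    """
    lr = base_lr
    for m in milestones:
        if step >= m:
            lr *= gamma  # decay once per milestone passed
    return lr


# After both decays the lr reaches 1e-5; during Test-Time Personalization
# the paper instead uses a *fixed* lr of 1e-4, so no schedule applies there.
```

For Penn Action or Human 3.6M, the same function would be used with milestones (246k, 328k) or (90k, 120k) respectively, under the same reading of the schedules.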