GPAvatar: Generalizable and Precise Head Avatar from Image(s)

Authors: Xuangeng Chu, Yu Li, Ailing Zeng, Tianyu Yang, Lijian Lin, Yunfei Liu, Tatsuya Harada

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we will first introduce the dataset we use, the implementation details of our method, and the baselines of our work. We will then compare our method with existing approaches using a variety of metrics.
Researcher Affiliation | Collaboration | Xuangeng Chu (1,2), Yu Li (2), Ailing Zeng (2), Tianyu Yang (2), Lijian Lin (2), Yunfei Liu (2), Tatsuya Harada (1,3). (1) The University of Tokyo, (2) International Digital Economy Academy (IDEA), (3) RIKEN AIP. {xuangeng.chu, harada}@mi.t.u-tokyo.ac.jp; {liyu, zengailing, yangtianyu, linlijian, liuyunfei}@idea.edu.cn
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | Code and demos are available at https://xg-chu.github.io/project_gpavatar. ... The model architectures and training details are introduced in Appendix A.1, and we also release the code for the model at https://github.com/xg-chu/GPAvatar.
Open Datasets | Yes | We use the VFHQ (Xie et al., 2022) dataset to train our model. ... Regarding evaluation, we assessed our method on the VFHQ dataset (Xie et al., 2022) and the HDTF dataset (Zhang et al., 2021).
Dataset Splits | No | The paper mentions using VFHQ for training and VFHQ/HDTF for evaluation, but it does not specify a distinct validation split (e.g., percentages or sample counts) for hyperparameter tuning or model selection.
Hardware Specification | Yes | We conducted training on 2 NVIDIA Tesla A100 GPUs, with a total batch size of 8.
Software Dependencies | No | Our framework is built upon the PyTorch framework (Paszke et al., 2017), and during the training process, we employ the ADAM (Kingma & Ba, 2014) optimizer with a learning rate of 1.0e-4. (PyTorch and the Adam optimizer are named, but no version numbers or a full dependency list are given.)
Experiment Setup | Yes | Our framework is built upon the PyTorch framework (Paszke et al., 2017), and during the training process, we employ the ADAM (Kingma & Ba, 2014) optimizer with a learning rate of 1.0e-4. We conducted training on 2 NVIDIA Tesla A100 GPUs, with a total batch size of 8. During the training process, our PEF searches for the nearest K=8 points, while MTA selects two frames as source images. Our approach employs an end-to-end training methodology. The training process consists of 150,000 iterations and the full training process consumes approximately 50 GPU hours...
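The experiment-setup row amounts to a small training configuration. As a minimal sketch, the quoted hyperparameters can be collected in one place and sanity-checked; note that the config keys and helper functions below are illustrative conventions, not names from the released GPAvatar code:

```python
# Training configuration quoted in the "Experiment Setup" row of the paper.
# Key names and helpers are illustrative placeholders, not from the code release.
CONFIG = {
    "optimizer": "Adam",           # Kingma & Ba, 2014
    "learning_rate": 1.0e-4,
    "total_batch_size": 8,
    "num_gpus": 2,                 # 2x NVIDIA Tesla A100
    "total_iterations": 150_000,
    "pef_nearest_k": 8,            # PEF searches the nearest K=8 points
    "mta_source_frames": 2,        # MTA selects two frames as source images
}

def per_gpu_batch(cfg):
    """Batch size each GPU sees under simple data parallelism."""
    return cfg["total_batch_size"] // cfg["num_gpus"]

def samples_seen(cfg):
    """Total training samples processed over the full run."""
    return cfg["total_iterations"] * cfg["total_batch_size"]

if __name__ == "__main__":
    print(per_gpu_batch(CONFIG))   # batch of 4 per A100
    print(samples_seen(CONFIG))    # 1,200,000 samples over 150k iterations
```

Derived numbers like these (4 samples per GPU per step; 1.2M samples total) are a quick consistency check when attempting a reproduction on different hardware.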