Robust Video Portrait Reenactment via Personalized Representation Quantization
Authors: Kaisiyuan Wang, Changcheng Liang, Hang Zhou, Jiaxiang Tang, Qianyi Wu, Dongliang He, Zhibin Hong, Jingtuo Liu, Errui Ding, Ziwei Liu, Jingdong Wang
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Comprehensive experiments have been conducted to validate the effectiveness of our approach. |
| Researcher Affiliation | Collaboration | Kaisiyuan Wang (The University of Sydney), Changcheng Liang (Xidian University), Hang Zhou* (Baidu Inc.), Jiaxiang Tang (Peking University), Qianyi Wu (Monash University), Dongliang He (Baidu Inc.), Zhibin Hong (Baidu Inc.), Jingtuo Liu (Baidu Inc.), Errui Ding (Baidu Inc.), Ziwei Liu (S-Lab, Nanyang Technological University), Jingdong Wang (Baidu Inc.) |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide an explicit statement about releasing source code or a link to a code repository. |
| Open Datasets | Yes | We evaluate our methods on eight video sequences including five videos from the HDTF (Zhang et al. 2021) dataset, one video from ADNerf (Guo et al. 2021) dataset, one video from LSP (Lu, Chai, and Cao 2021) dataset and one video from Nerface (Gafni et al. 2021) dataset. |
| Dataset Splits | No | The paper mentions training and testing but does not detail train/validation/test splits with percentages or counts. It notes '1000 frames from the test set of each subject' but gives neither the overall dataset size nor the remaining splits. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper mentions 'implemented on PyTorch' but does not specify software versions for PyTorch or other dependencies. |
| Experiment Setup | Yes | All experiments are implemented on PyTorch using the Adam optimizer with an initial learning rate of 5e-4 and a batch size of 4. Note that, as we adopt a temporal training strategy, the 4 images in a batch are consecutive frames collected from the same video clip. The training procedure is performed in a self-reenactment manner for both stages. For both VQGAN (Esser, Rombach, and Ommer 2021) and ViT (Dosovitskiy et al. 2021), we use their standard blocks. (A minimal sketch of this setup appears after the table.) |
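
The quoted setup pins down only the optimizer (Adam, initial learning rate 5e-4), the batch size of 4 consecutive frames from one clip, and the self-reenactment training objective. The sketch below shows how such a configuration might look in PyTorch; the `ConsecutiveFrameDataset` class, the generic `model`, and the L1 reconstruction loss are illustrative assumptions, not the authors' released code.

```python
# Minimal sketch, assuming a generic reconstruction model and an in-memory
# frame tensor. Only the optimizer settings and consecutive-frame batching
# follow the paper; everything else is a placeholder.
import torch
from torch.utils.data import Dataset, DataLoader


class ConsecutiveFrameDataset(Dataset):
    """Yields windows of 4 consecutive frames from a single subject's video clip."""

    def __init__(self, frames: torch.Tensor, window: int = 4):
        self.frames = frames          # shape (T, C, H, W)
        self.window = window

    def __len__(self):
        return len(self.frames) - self.window + 1

    def __getitem__(self, idx):
        # One "batch" in the paper is a temporal window of consecutive frames.
        return self.frames[idx: idx + self.window]


def train_stage(model: torch.nn.Module, frames: torch.Tensor, steps: int = 10_000):
    dataset = ConsecutiveFrameDataset(frames, window=4)
    loader = DataLoader(dataset, batch_size=1, shuffle=True)  # each item already holds 4 frames
    optimizer = torch.optim.Adam(model.parameters(), lr=5e-4)  # initial lr from the paper
    model.train()
    for step, window in enumerate(loader):
        if step >= steps:
            break
        window = window.squeeze(0)                 # (4, C, H, W) consecutive frames
        recon = model(window)                      # self-reenactment: reconstruct the input frames
        loss = torch.nn.functional.l1_loss(recon, window)  # placeholder reconstruction loss
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```

The same loop would be run separately for each of the two training stages mentioned in the paper; the actual losses and network blocks (VQGAN and ViT) are not reproduced here.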