SwiftAvatar: Efficient Auto-Creation of Parameterized Stylized Character on Arbitrary Avatar Engines
Authors: Shizun Wang, Weihong Zeng, Xu Wang, Hao Yang, Li Chen, Chuang Zhang, Ming Wu, Yi Yuan, Yunzhao Zeng, Min Zheng, Jing Liu
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments demonstrate the effectiveness and efficiency of SwiftAvatar on two different avatar engines. The superiority and advantageous flexibility of SwiftAvatar are also verified in both subjective and objective evaluations. |
| Researcher Affiliation | Collaboration | ¹ Beijing University of Posts and Telecommunications, ² Douyin Vision; {wangshizun, zhangchuang, wuming}@bupt.edu.cn, {zengweihong, wangxu.ailab, yang.hao, chenli.phd, yuanyi.cv, zengyunzhao, zhengmin.666, jing.liu}@bytedance.com |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks (e.g., clearly labeled algorithm sections or code-like formatted procedures). |
| Open Source Code | No | The paper does not provide an explicit statement about releasing the source code for the described methodology or a direct link to a code repository. |
| Open Datasets | Yes | For evaluation, we choose 116 images from FFHQ dataset (Karras, Laine, and Aila 2019)... Pretrained SemanticStyleGAN on CelebAMask-HQ (Lee et al. 2020) is directly used as realistic generator G_real... |
| Dataset Splits | No | The paper mentions using a 'training stage' for the avatar estimator and an 'evaluation dataset' for human rating, but it does not provide specific details on training/validation/test dataset splits (e.g., percentages, sample counts, or citations to predefined splits) to reproduce data partitioning. |
| Hardware Specification | Yes | We implement our methods using PyTorch 1.10 library and perform all experiments on NVIDIA V100 GPUs. |
| Software Dependencies | Yes | We implement our methods using PyTorch 1.10 library and perform all experiments on NVIDIA V100 GPUs. |
| Experiment Setup | Yes | Batch size is set to 16, style mixing probability (Karras, Laine, and Aila 2019) is set to 0.3. λ_R1, λ_path are set to 10 and 0.5 separately. Lazy regularization (Karras et al. 2020) is applied every 16 mini-batches for discriminator (R1 regularization) and every 4 mini-batches for generator (path length regularization). All the images used for generators are aligned and resized to resolution 512 × 512. The optimization-based GAN inversion approach employs Adam (Kingma and Ba 2014) optimizer in the paired data production stage, and the learning rate initially follows cosine annealing with 0.1. We optimize 200 steps for all latent codes, and λ_i, λ_p, λ_l are set to 0.1, 1 and 1, respectively... For semantic augmentation, we generate 10 augmented images for each latent code, using randomly generated noise in W space. We set λ_aug to 1 for the background... and also set λ_aug to 0.3, 0.06 for the hair part and glasses part... In the avatar estimator training stage, the input images of avatar estimator are resized to 224 × 224. We use the Adam optimizer with batch size 256 to train 100 epochs. The learning rate is set to 1e-3, and decayed by half per 30 epochs. |
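
The avatar-estimator schedule quoted above (Adam, batch size 256, 100 epochs, learning rate 1e-3 halved every 30 epochs, inputs resized to 224 × 224) maps directly onto standard PyTorch components. The sketch below is a minimal, hypothetical reconstruction: the backbone, the parameter dimensionality (`NUM_AVATAR_PARAMS`), the dummy paired data, and the L1 loss are assumptions, not details taken from the paper; only the hyperparameters come from the quoted setup.

```python
# Minimal sketch of the quoted avatar-estimator training schedule.
# Only the hyperparameters (Adam, batch size 256, 100 epochs, lr 1e-3 halved
# every 30 epochs, 224x224 inputs) come from the paper's stated setup; the
# backbone, output size, data, and loss below are placeholders.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

NUM_AVATAR_PARAMS = 64  # hypothetical size of the engine's parameter vector

# Placeholder estimator: the paper does not specify the backbone in this excerpt.
estimator = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=7, stride=4), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(32, NUM_AVATAR_PARAMS),
)

# Dummy data standing in for (rendered avatar image, avatar parameters) pairs.
images = torch.randn(256, 3, 224, 224)          # inputs resized to 224x224
params = torch.randn(256, NUM_AVATAR_PARAMS)
loader = DataLoader(TensorDataset(images, params), batch_size=256, shuffle=True)

optimizer = torch.optim.Adam(estimator.parameters(), lr=1e-3)
# "decayed by half per 30 epochs" -> step decay with gamma = 0.5
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.5)

for epoch in range(100):
    for x, y in loader:
        optimizer.zero_grad()
        loss = nn.functional.l1_loss(estimator(x), y)  # loss choice is an assumption
        loss.backward()
        optimizer.step()
    scheduler.step()
```

The paired-data-production stage quoted above could be sketched analogously by optimizing the latent codes themselves for 200 steps with `torch.optim.Adam` at an initial learning rate of 0.1 and `torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=200)`.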