MobileFaceSwap: A Lightweight Framework for Video Face Swapping
Authors: Zhiliang Xu, Zhibin Hong, Changxing Ding, Zhen Zhu, Junyu Han, Jingtuo Liu, Errui Ding
AAAI 2022, pp. 2973–2981 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our method decreases the computations of SimSwap and FaceShifter by 146 and 207 times while preserving the visual fidelity. The presented IDN contains only 0.50M parameters and needs 0.33G FLOPs per frame, making it capable of real-time video face swapping on mobile phones. Finally, our method achieves comparable results with the teacher models and other state-of-the-art methods. Experiments Implementation details. The training images are collected from VGGFace2 (Cao et al. 2018). We conduct all ablation studies using SimSwap as the teacher model to verify the efficiency and necessity of our network architecture designs. The qualitative and quantitative results are shown in Fig. 8 and Table 2, respectively. (A hedged parameter/FLOP-counting sketch follows the table.) |
| Researcher Affiliation | Collaboration | Zhiliang Xu^1, Zhibin Hong^1*, Changxing Ding^2, Zhen Zhu^3, Junyu Han^1, Jingtuo Liu^1, Errui Ding^1. 1 Baidu Inc.; 2 South China University of Technology; 3 University of Illinois at Urbana-Champaign |
| Pseudocode | No | The paper contains network diagrams and mathematical formulations but does not include any pseudocode or algorithm blocks. |
| Open Source Code | No | The paper mentions that the SimSwap model is from an "official repository" and that FaceShifter's "source codes are not released". It also links to the DeepFakes GitHub repository for comparison, but there is no explicit statement or link indicating that the source code for their proposed method (MobileFaceSwap) is publicly available. |
| Open Datasets | Yes | The training images are collected from VGGFace2 (Cao et al. 2018). The test images are collected from CelebA-HQ (Karras et al. 2018), and further evaluation is conducted on the FaceForensics++ (Rossler et al. 2019) dataset. |
| Dataset Splits | No | The paper states that training images are from VGGFace2 (550K images) and test images from CelebA-HQ and FaceForensics++. However, it does not provide specific train/validation/test split percentages, sample counts for each split, or references to predefined splits with full details to reproduce the data partitioning. |
| Hardware Specification | Yes | our model can achieve real-time face swapping on the mobile phone with a MediaTek Dimensity 1100 chip, arriving at 26 FPS. FPS is tested on the mobile phone with the MediaTek Dimensity 1100 chip. |
| Software Dependencies | No | The paper mentions several components like VGG network, ResNet-18, ArcFace, and CosFace, but it does not specify software versions for any key libraries (e.g., PyTorch, TensorFlow, CUDA) or other ancillary software components used in their experiments. |
| Experiment Setup | Yes | The image sizes are 224×224 and 256×256 for SimSwap and FaceShifter, respectively. The total loss is defined as the sum of the above losses: L = L_adv + α(λ_rec·L_rec + λ_per·L_per) + λ_id·L_id + λ_mask·L_mask (Eq. 7), where L_adv denotes the GAN loss, and we set λ_id = 3, λ_rec = 30, λ_per = 5, and λ_mask = 10, respectively. (A hedged PyTorch sketch of this loss follows the table.) |
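
The weighted total loss in Eq. (7) is straightforward to express in code. Below is a minimal PyTorch sketch: only the λ weights come from the paper; the individual loss terms, the α schedule, and the function name `total_loss` are our own placeholders, since the authors release no reference implementation.

```python
import torch

# Minimal sketch of Eq. (7). Only the lambda weights
# (λ_id=3, λ_rec=30, λ_per=5, λ_mask=10) are from the paper;
# the loss-term values and the alpha schedule are placeholders.
def total_loss(l_adv: torch.Tensor,
               l_rec: torch.Tensor,
               l_per: torch.Tensor,
               l_id: torch.Tensor,
               l_mask: torch.Tensor,
               alpha: float,
               lambda_rec: float = 30.0,
               lambda_per: float = 5.0,
               lambda_id: float = 3.0,
               lambda_mask: float = 10.0) -> torch.Tensor:
    # L = L_adv + α(λ_rec·L_rec + λ_per·L_per) + λ_id·L_id + λ_mask·L_mask
    return (l_adv
            + alpha * (lambda_rec * l_rec + lambda_per * l_per)
            + lambda_id * l_id
            + lambda_mask * l_mask)

# Example with dummy scalar losses:
dummy = [torch.tensor(0.1)] * 5
print(total_loss(*dummy, alpha=1.0))  # tensor(4.9000)
```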
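
The reported model budget (0.50M parameters, 0.33G FLOPs per frame) can be checked against any candidate re-implementation with a generic counter. The sketch below assumes PyTorch; `toy_idn` is a stand-in module, not the paper's IDN, and the optional `thop` profiler is one common FLOP-estimation tool, not something the paper prescribes.

```python
import torch
import torch.nn as nn

def count_params_millions(model: nn.Module) -> float:
    """Trainable parameters in millions (the paper reports 0.50M for the IDN)."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad) / 1e6

# `toy_idn` is a stand-in module only; the real IDN architecture is
# described in the paper but has no released reference implementation.
toy_idn = nn.Sequential(
    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 3, 3, padding=1),
)
print(f"params: {count_params_millions(toy_idn):.3f}M")

# FLOPs per frame can be estimated with a third-party profiler such as
# `thop` (pip install thop); note it reports multiply-accumulates (MACs),
# which are often doubled when quoted as FLOPs.
# from thop import profile
# macs, params = profile(toy_idn, inputs=(torch.randn(1, 3, 224, 224),))
```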