Can We Leave Deepfake Data Behind in Training Deepfake Detector?
Authors: Jikang Cheng, Zhiyuan Yan, Ying Zhang, Yuhao Luo, Zhongyuan Wang, Chen Li
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments confirm that our design allows leveraging forgery information from both blendfake and deepfake effectively and comprehensively. 4 Experiments. In Tab. 1, we provide extensive comparison results with existing state-of-the-art (SoTA) deepfake detectors based on DeepfakeBench [52], where all methods are trained on FF++ (HQ) and tested on other datasets. 4.3 Ablation Study. |
| Researcher Affiliation | Collaboration | Jikang Cheng¹, Zhiyuan Yan², Ying Zhang³, Yuhao Luo², Zhongyuan Wang¹, Chen Li³. ¹School of Computer Science, Wuhan University; ²The Chinese University of Hong Kong, Shenzhen (CUHK-Shenzhen); ³WeChat, Tencent Inc. |
| Pseudocode | Yes | Algorithm 1: Training ProDet |
| Open Source Code | Yes | Code is available at https://github.com/beautyremain/ProDet. |
| Open Datasets | Yes | FaceForensics++ (FF++) [37] is constructed by four forgery methods including Deepfakes (DF) [15], Face2Face (F2F) [44], FaceSwap (FS) [18], and NeuralTextures (NT) [43]. FF++ with High Quality (HQ) is employed as the training dataset for all experiments in our paper. The base images to generate blendfake images are also from FF++ (HQ) real. For cross-dataset evaluations, we introduce Celeb-DF-v1 (CDFv1) [29], Celeb-DF-v2 (CDFv2) [29], DeepFake Detection Challenge Preview (DFDCP) [16], and DeepFake Detection Challenge (DFDC) [16]. (An evaluation-protocol sketch follows the table.) |
| Dataset Splits | No | The paper specifies FF++ (HQ) as the training dataset and other datasets for cross-dataset evaluations (testing), but it does not explicitly provide details about a validation split within these datasets, such as percentages or sample counts. |
| Hardware Specification | Yes | All experiments are conducted on two NVIDIA Tesla V100 GPUs. |
| Software Dependencies | No | The paper mentions software components such as 'EfficientNet-B4 [42]', 'Adam optimizer', and 'Dlib [25]', but it does not provide specific version numbers for programming languages, libraries, or frameworks used (e.g., Python, PyTorch, TensorFlow, Dlib version). |
| Experiment Setup | Yes | The trade-off parameters are set to β = 1 and γ = 10. The Adam optimizer is used with a learning rate of 0.0002, 20 epochs, an input size of 256×256, and a batch size of 24. Feature Bridging is deployed after a warm-up phase of two epochs. (A hedged configuration sketch follows the table.) |
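
To make the reported setup concrete, below is a minimal PyTorch sketch of the training configuration. It is not the authors' implementation (which is released at https://github.com/beautyremain/ProDet): the hyperparameter values (β = 1, γ = 10, learning rate 0.0002, 20 epochs, 256×256 inputs, batch size 24, two warm-up epochs) and the EfficientNet-B4 backbone come from the paper, while `bridging_term`, `consistency_term`, and the exact losses that β and γ weight are hypothetical placeholders, since the quoted excerpts do not specify them.

```python
import torch
from torch import nn, optim
from torch.utils.data import DataLoader, TensorDataset
from torchvision.models import efficientnet_b4

BETA, GAMMA = 1.0, 10.0              # trade-off parameters reported in the paper
LR, EPOCHS, BATCH_SIZE = 2e-4, 20, 24
INPUT_SIZE = 256                     # inputs are resized to 256x256
WARMUP_EPOCHS = 2                    # Feature Bridging starts after two epochs

# Dummy stand-in for the FF++ (HQ) real/blendfake/deepfake training loader.
train_loader = DataLoader(
    TensorDataset(torch.randn(96, 3, INPUT_SIZE, INPUT_SIZE),
                  torch.randint(0, 2, (96,))),
    batch_size=BATCH_SIZE, shuffle=True)

model = efficientnet_b4(num_classes=2)   # backbone quoted in the paper
optimizer = optim.Adam(model.parameters(), lr=LR)
ce = nn.CrossEntropyLoss()

def bridging_term(logits: torch.Tensor) -> torch.Tensor:
    """Hypothetical placeholder for the paper's Feature Bridging objective."""
    return logits.pow(2).mean()

def consistency_term(logits: torch.Tensor) -> torch.Tensor:
    """Hypothetical placeholder for a second auxiliary objective."""
    return logits.abs().mean()

for epoch in range(EPOCHS):
    for images, labels in train_loader:
        logits = model(images)
        loss = ce(logits, labels)
        # Warm-up gate from the paper: auxiliary terms only after two epochs.
        if epoch >= WARMUP_EPOCHS:
            loss = loss + BETA * bridging_term(logits) \
                        + GAMMA * consistency_term(logits)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```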
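
For the cross-dataset protocol noted in the Open Datasets row (train on FF++ (HQ), test on CDFv1, CDFv2, DFDCP, and DFDC), the sketch below assumes the frame-level ROC-AUC metric commonly reported for DeepfakeBench-style evaluations; the table does not quote the metric explicitly. `make_test_loader` is a hypothetical helper, and `model` is the network from the sketch above.

```python
from sklearn.metrics import roc_auc_score
import torch

TEST_SETS = ["CDFv1", "CDFv2", "DFDCP", "DFDC"]  # unseen during training

@torch.no_grad()
def evaluate_auc(model: torch.nn.Module, loader) -> float:
    """Score every frame and compute ROC-AUC against the true labels."""
    model.eval()
    scores, labels = [], []
    for images, y in loader:
        probs = model(images).softmax(dim=1)[:, 1]  # P(fake)
        scores.extend(probs.tolist())
        labels.extend(y.tolist())
    return roc_auc_score(labels, scores)

for name in TEST_SETS:
    loader = make_test_loader(name)  # hypothetical per-dataset loader helper
    print(f"{name}: AUC = {evaluate_auc(model, loader):.4f}")
```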