reproducibilityindex.ai

Can We Leave Deepfake Data Behind in Training Deepfake Detector?

Authors: Jikang Cheng, Zhiyuan Yan, Ying Zhang, Yuhao Luo, Zhongyuan Wang, Chen Li

NeurIPS 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experiments confirm that our design allows leveraging forgery information from both blendfake and deepfake effectively and comprehensively. 4 Experiments. In Tab. 1, we provide extensive comparison results with existing state-of-the-art (SoTA) deepfake detectors based on DeepfakeBench [52], where all methods are trained on FF++ (HQ) and tested on other datasets. 4.3 Ablation Study.
Researcher Affiliation	Collaboration	Jikang Cheng (1), Zhiyuan Yan (2), Ying Zhang (3), Yuhao Luo (2), Zhongyuan Wang (1), Chen Li (3). (1) School of Computer Science, Wuhan University; (2) The Chinese University of Hong Kong, Shenzhen (CUHK-Shenzhen); (3) WeChat, Tencent Inc.
Pseudocode	Yes	Algorithm 1: Training ProDet
Open Source Code	Yes	Code is available at https://github.com/beautyremain/ProDet.
Open Datasets	Yes	FaceForensics++ (FF++) [37] is constructed by four forgery methods including Deepfakes (DF) [15], Face2Face (F2F) [44], FaceSwap (FS) [18], and NeuralTextures (NT) [43]. FF++ with High Quality (HQ) is employed as the training dataset for all experiments in our paper. The base images to generate blendfake images are also from FF++ (HQ) real. For cross-dataset evaluations, we introduce Celeb-DF-v1 (CDFv1) [29], Celeb-DF-v2 (CDFv2) [29], DeepFake Detection Challenge Preview (DFDCP) [16], and DeepFake Detection Challenge (DFDC) [16].
Dataset Splits	No	The paper specifies FF++ (HQ) as the training dataset and other datasets for cross-dataset evaluations (testing), but it does not explicitly provide details about a validation split within these datasets, such as percentages or sample counts.
Hardware Specification	Yes	All experiments are conducted on two NVIDIA Tesla V100 GPUs.
Software Dependencies	No	The paper mentions software components such as 'EfficientNet-B4 [42]', 'Adam optimizer', and 'Dlib [25]', but it does not provide specific version numbers for the programming languages, libraries, or frameworks used (e.g., Python, PyTorch, TensorFlow, Dlib).
Experiment Setup	Yes	The trade-off parameters are set to β = 1 and γ = 10. The Adam optimizer is used with a learning rate of 0.0002, 20 training epochs, an input size of 256 × 256, and a batch size of 24. Feature Bridging is deployed after a warm-up phase of two epochs.
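
For readers who want the reported setup in machine-readable form, the values in the Experiment Setup row can be collected into a small configuration object. This is only a sketch: the names `TrainConfig` and `feature_bridging_enabled` are hypothetical and are not taken from the ProDet repository, and the assumption that epochs are counted from zero (so the warm-up covers epochs 0 and 1) is ours, since the row does not say how the warm-up boundary is indexed.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters copied from the Experiment Setup row (names are hypothetical)."""
    beta: float = 1.0                      # trade-off parameter β
    gamma: float = 10.0                    # trade-off parameter γ
    learning_rate: float = 2e-4            # Adam learning rate (0.0002)
    epochs: int = 20                       # total training epochs
    input_size: Tuple[int, int] = (256, 256)
    batch_size: int = 24
    warmup_epochs: int = 2                 # Feature Bridging is deployed after warm-up


def feature_bridging_enabled(epoch: int, cfg: TrainConfig) -> bool:
    """Return True once the warm-up phase is over.

    Assumption: epochs are 0-indexed, so epochs 0 and 1 form the warm-up.
    """
    return epoch >= cfg.warmup_epochs


cfg = TrainConfig()
# One flag per epoch, showing when Feature Bridging would be active.
schedule = [feature_bridging_enabled(e, cfg) for e in range(cfg.epochs)]
```

Under the 0-indexing assumption, Feature Bridging would be inactive for the first two epochs and active for the remaining eighteen; a 1-indexed reading only changes how the epochs are labeled, not the number of warm-up epochs.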