Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Learning Real Facial Concepts for Independent Deepfake Detection

Authors: Ming-Hui Liu, Harry Cheng, Tianyi Wang, Xin Luo, Xin-Shun Xu

IJCAI 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	extensive experiments on five widely used datasets demonstrate that RealID significantly outperforms existing state-of-the-art methods, achieving a 1.74% improvement in average accuracy.
Researcher Affiliation	Academia	1School of Software, Shandong University 2School of Computing, National University of Singapore 3College of Computing and Data Science, Nanyang Technological University
Pseudocode	No	The paper describes methods using text and mathematical equations but does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code	No	The paper does not contain an explicit statement about releasing code, nor does it provide a link to a code repository.
Open Datasets	Yes	We trained our model with the FF++ dataset [Rössler et al., 2019]. This dataset includes 1,000 real videos from YouTube, as well as five types of manipulated videos yielding a total of 6,000 videos. Finally, to evaluate the generalization capability of our model, we performed the cross-dataset testing on five widely used deepfake datasets, i.e., Celeb-DF [Li et al., 2020b], DFD [Dufour and Gully, 2020], DFDC [Dolhansky et al., 2020], DFDCp [Dolhansky et al., 2020], and UADFV [Li et al., 2018].
Dataset Splits	Yes	Similar to the common setup for generalizable deepfake detection [Wang and Deng, 2021; Fei et al., 2022; Cao et al., 2022], we trained our model with the FF++ dataset [Rössler et al., 2019]. This dataset includes 1,000 real videos from YouTube, as well as five types of manipulated videos yielding a total of 6,000 videos. Finally, to evaluate the generalization capability of our model, we performed the cross-dataset testing on five widely used deepfake datasets, i.e., Celeb-DF [Li et al., 2020b], DFD [Dufour and Gully, 2020], DFDC [Dolhansky et al., 2020], DFDCp [Dolhansky et al., 2020], and UADFV [Li et al., 2018].
Hardware Specification	Yes	We conducted the experiments on a single RTX 3090 GPU with a batch size of 16.
Software Dependencies	No	The paper mentions Dlib, EfficientNet, and ViT as tools or backbones but does not provide specific version numbers for these or any other software dependencies.
Experiment Setup	Yes	We conducted the experiments on a single RTX 3090 GPU with a batch size of 16. The backbone we employed is EfficientNet [Tan and Le, 2019]... The hyperparameters λ1, λ2, and λ3 in Equation (15) are selected via grid search and set to 0.6, 1.0, and 1.0, respectively.