Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Spot the Fake: Large Multimodal Model-Based Synthetic Image Detection with Artifact Explanation

Authors: Siwei Wen, Junyan Ye, Peilin Feng, Hengrui Kang, Zichen Wen, Yize Chen, Jiang Wu, Wenjun Wu, Conghui He, Weijia Li

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive evaluations across multiple datasets confirm the superiority of FakeVLM in both authenticity classification and artifact explanation tasks, setting a new benchmark for synthetic image detection. The code, model weights, and dataset can be found here: https://github.com/opendatalab/FakeVLM. 1 Introduction As AI-generated content technologies advance, synthetic images are increasingly integrated into our daily lives [1, 2, 3, 4]. ... 5 Experiment In this section, we introduce three additional datasets used in the experiments, alongside FakeClue, and describe our experimental setup. We then present FakeVLM's performance on general synthetic and DeepFake detection tasks, as well as its ability to explain image artifacts. Finally, we conduct ablation studies and further exploratory experiments to assess the model's performance.
Researcher Affiliation | Academia | (1) Shanghai Artificial Intelligence Laboratory, (2) Sun Yat-Sen University, (3) Beijing Advanced Innovation Center for Future Blockchain and Privacy Computing, Beihang University, (4) Shanghai Jiao Tong University, (5) The Chinese University of Hong Kong, Shenzhen
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. It includes diagrams like Figure 1 ('Construction pipeline of FakeClue dataset') and Figure 2 ('Overview of FakeVLM') but no textual pseudocode.
Open Source Code | Yes | The code, model weights, and dataset can be found here: https://github.com/opendatalab/FakeVLM.
Open Datasets | Yes | Additionally, we present FakeClue, a comprehensive dataset containing over 100,000 images across seven categories, annotated with fine-grained artifact clues in natural language. ... The code, model weights, and dataset can be found here: https://github.com/opendatalab/FakeVLM. ... For open synthetic datasets, we extracted approximately 80K data from GenImage [54], FF++ [55] and Chameleon [56], maintaining a 1:1 ratio of fake to real data.
Dataset Splits | Yes | Training/test sets are randomly split; the test set contains 5,000 diverse image samples. Detailed dataset information is provided in the supplementary materials. ... For FF++ and DD-VQA, we use their default training-test splits for evaluation.
Hardware Specification | Yes | The training is conducted for two epochs on eight NVIDIA A100 GPUs with a batch size of 32 per GPU using a 2e-5 learning rate with 3% linear warmup and cosine decay.
Software Dependencies | No | The paper mentions using LLaVA-1.5 7B and Vicuna-v1.5-7B as base models, but it does not provide specific version numbers for ancillary software components like programming languages, libraries, or frameworks (e.g., Python, PyTorch, CUDA versions).
Experiment Setup | Yes | The training is conducted for two epochs on eight NVIDIA A100 GPUs with a batch size of 32 per GPU using a 2e-5 learning rate with 3% linear warmup and cosine decay. This full fine-tuning adapted the model to synthetic data detection/explanation nuances while preserving its general instruction-following capabilities.
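
Illustrative sketch (not from the paper or its repository): the quoted experiment setup names a 2e-5 learning rate with 3% linear warmup and cosine decay, two epochs, and a batch size of 32 on each of eight GPUs. The plain-Python snippet below shows one common way such a schedule can be computed. Several things here are assumptions rather than reported facts: the training-set size (NUM_TRAIN_SAMPLES is a hypothetical placeholder), the absence of gradient accumulation, the decay reaching zero at the end of training, and the exact warmup convention. The authors' actual training code may differ on any of these.

```python
import math

# Values taken from the quoted experiment setup.
PEAK_LR = 2e-5
WARMUP_RATIO = 0.03   # "3% linear warmup"
NUM_GPUS = 8
PER_GPU_BATCH = 32
EPOCHS = 2

# Hypothetical placeholder: the quoted text does not state the exact
# number of training samples.
NUM_TRAIN_SAMPLES = 100_000

# Assumes no gradient accumulation, so the global batch is GPUs x per-GPU batch.
GLOBAL_BATCH = NUM_GPUS * PER_GPU_BATCH
STEPS_PER_EPOCH = math.ceil(NUM_TRAIN_SAMPLES / GLOBAL_BATCH)
TOTAL_STEPS = STEPS_PER_EPOCH * EPOCHS
WARMUP_STEPS = max(1, int(TOTAL_STEPS * WARMUP_RATIO))


def lr_at_step(step: int) -> float:
    """Learning rate for a 0-indexed optimizer step.

    Linear warmup up to PEAK_LR, then cosine decay toward zero.
    This is one common convention, assumed here for illustration.
    """
    if step < WARMUP_STEPS:
        # Linear ramp that reaches PEAK_LR on the last warmup step.
        return PEAK_LR * (step + 1) / WARMUP_STEPS
    # Fraction of the post-warmup phase that has elapsed, in [0, 1).
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    return 0.5 * PEAK_LR * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    print(f"global batch size: {GLOBAL_BATCH}")
    for s in (0, WARMUP_STEPS - 1, TOTAL_STEPS // 2, TOTAL_STEPS - 1):
        print(f"step {s:4d}: lr = {lr_at_step(s):.3e}")
```

Under these assumptions the global batch size is 8 × 32 = 256 samples per optimizer step. Changing NUM_TRAIN_SAMPLES only rescales the step counts; the shape of the schedule stays the same.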