Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Denoising Trajectory Biases for Zero-Shot AI-Generated Image Detection
Authors: Yachao Liang, Min Yu, Gang Li, Jianguo Jiang, Fuqiang Du, Jingyuan Li, Lanchi Xie, Zhen Xu, Weiqing Huang
NeurIPS 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments across a wide range of generators demonstrate that our method achieves significant improvements over state-of-the-art supervised and zero-shot counterparts. Code is available here. To validate the effectiveness of our method, we conducted extensive experiments on multiple datasets of generated images. The evaluation covered a wide range of generative models, including some of the most recent ones. Experimental results demonstrate that our approach exhibits strong generalization ability. |
| Researcher Affiliation | Academia | (1) Institute of Information Engineering, Chinese Academy of Sciences; (2) School of Cyber Security, University of Chinese Academy of Sciences; (3) Deakin University; (4) Beijing Technology and Business University; (5) Institute of Forensic Science, Ministry of Public Security |
| Pseudocode | No | The paper describes the method using textual descriptions and mathematical equations (e.g., Equation 1, 2, 3, 4, 5, 6) and diagrams (Figure 2 'Overview of the proposed method'), but does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | Code is available here. NeurIPS Paper Checklist 5. Open access to data and code Question: Does the paper provide open access to the data and code, with sufficient instructions to faithfully reproduce the main experimental results, as described in supplemental material? Answer: [Yes] Justification: We have provided the project demo in the supplementary material, and we will open the source code once our paper gets published. |
| Open Datasets | Yes | ForenSynths [62] contains images generated by various GANs, e.g., ProGAN [27] and StyleGAN [28]. The real images are collected from LSUN [66], ImageNet [52], COCO [33], and CelebA [37]. GenImage [70] include 8 early text-to-image diffusion datasets, such as Stable Diffusion V1.4 [50] and Glide [41], and the real images are sampled from ImageNet [52]. New Generator: Considering the rapid development of generative models, we also test on images generated by several cutting-edge generative models. Specifically, we take the test set of COCO [33] as the real image dataset, then we collect images generated by FLUX [30], Stable Diffusion XL (SDXL) [45], and Stable Diffusion V3 (SD3) [15] with corresponding prompts of real images. We further collected fake images generated by DALLE3 [5], Firefly, and Midjourney-v5 (MJv5) [1] from [4]. |
| Dataset Splits | Yes | Concretely, models are trained on the training set generated by ProGAN [27], which involves four types of images (cat, chair, car, and horse), and then evaluated on the test set containing other GANs. ... Specifically, we take the test set of COCO [33] as the real image dataset |
| Hardware Specification | Yes | Our experiments are implemented with PyTorch on NVIDIA A100 GPU. |
| Software Dependencies | No | Our experiments are implemented with PyTorch on NVIDIA A100 GPU. We use CLIP ViT-L/14 to extract features. |
| Experiment Setup | Yes | We perform a 50-step DDIM inversion with Stable Diffusion v1.5, before which we crop images to the size of 512 × 512. We use CLIP ViT-L/14 to extract features. Our experiments are implemented with PyTorch on NVIDIA A100 GPU. We set the detection threshold as 0.75. |
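
For readers who want to record the settings quoted in the Experiment Setup row, the following is a minimal sketch, not the authors' code (which had not been released at the time of classification). The class and function names are hypothetical. Two details are assumptions the excerpt does not confirm: that the crop is a center crop, and that a score at or above the threshold marks an image as generated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReportedSetup:
    """Settings quoted in the Experiment Setup row (field names are hypothetical)."""
    inversion_steps: int = 50                       # 50-step DDIM inversion
    diffusion_model: str = "Stable Diffusion v1.5"  # model used for the inversion
    crop_size: int = 512                            # images cropped to 512 x 512
    feature_extractor: str = "CLIP ViT-L/14"        # feature backbone
    detection_threshold: float = 0.75               # reported detection threshold


def center_crop_box(width: int, height: int, size: int) -> tuple[int, int, int, int]:
    """Return (left, top, right, bottom) for a centered size x size crop.

    Assumption: the excerpt says images are cropped but not where, so a
    center crop is used here purely for illustration.
    """
    if width < size or height < size:
        raise ValueError("image is smaller than the crop size")
    left = (width - size) // 2
    top = (height - size) // 2
    return left, top, left + size, top + size


def is_flagged(score: float, setup: ReportedSetup) -> bool:
    """Assumption: scores at or above the threshold are flagged as generated."""
    return score >= setup.detection_threshold
```

The crop box uses the (left, top, right, bottom) convention so it can be passed to a typical image library's crop call; how the detection score itself is computed is outside the scope of this sketch.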