Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Epistemic Uncertainty for Generated Image Detection
Authors: Jun Nie, Yonggang Zhang, Tongliang Liu, Yiu-ming Cheung, Bo Han, Xinmei Tian
NeurIPS 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments across multiple benchmarks show the proposed method surpasses existing methods, highlighting the efficacy of uncertainty in detecting AI-generated images. We conduct full comparative experiments on four benchmarks mentioned. As shown in Tables 1, 10, 11 and 9, WePe achieves good detection performance on ImageNet, LSUN-BEDROOM, DRCT-2M and GenImage. Experimental results show the effectiveness of uncertainty estimation in detecting AI-generated images. |
| Researcher Affiliation | Academia | 1 MoE Key Laboratory of Brain-inspired Intelligent Perception and Cognition, University of Science and Technology of China; 2 Hong Kong Baptist University; 3 Sydney AI Centre, The University of Sydney; 4 The Hong Kong University of Science and Technology |
| Pseudocode | No | The paper describes the methodology using textual explanations and mathematical equations (e.g., Eq. 13, Eq. 16) but does not include a distinct, structured pseudocode or algorithm block. |
| Open Source Code | Yes | Code is available at https://github.com/tmlr-group/WePe. |
| Open Datasets | Yes | IMAGENET. The natural images and generated images can be obtained at https://github.com/layer6ai-labs/dgm-eval. The images are provided by (Stein et al., 2023). LSUN-BEDROOM. The natural images and generated images can be obtained at https://github.com/layer6ai-labs/dgm-eval. The images are provided by (Stein et al., 2023). GenImage. The natural images and generated images can be obtained at https://github.com/GenImage-Dataset/GenImage. The images are provided by (Zhu et al., 2023). DRCT-2M. The natural images of DRCT-2M come from CoCo and can be obtained from https://cocodataset.org/#download. AI-generated images of DRCT-2M can be obtained from https://modelscope.cn/datasets/BokingChen/DRCT-2M/files, which are provided by (Chen et al., 2024). |
| Dataset Splits | No | The paper specifies which datasets are used for training (e.g., the ProGAN dataset for ImageNet/LSUN-BEDROOM, SDv1.4 for GenImage, SDv2 for DRCT-2M) and testing (various generative models), and mentions using standard benchmarks. However, it does not provide specific split percentages or sample counts for training, validation, or test sets in the context of the natural vs. generated image detection task. It implies the labels are used to set a threshold, but the specific numerical splits are not detailed. |
| Hardware Specification | Yes | We use python 3.8.16 and Pytorch 1.12.1, and several NVIDIA GeForce RTX-3090 GPU and NVIDIA GeForce RTX-4090 GPU. |
| Software Dependencies | Yes | We use python 3.8.16 and Pytorch 1.12.1, and several NVIDIA GeForce RTX-3090 GPU and NVIDIA GeForce RTX-4090 GPU. |
| Experiment Setup | Yes | In DINOv2 ViT-L/14, the model has 24 transformer blocks, and we only perturb the parameters of the first 19 blocks with Gaussian perturbations of zero mean. The variance of Gaussian noise is proportional to the mean value of the parameters in each block, with the ratio set to 0.1. For WePe, we leverage LoRA (Hu et al., 2022) for parameter-efficient fine-tuning. The LoRA layers are applied on the q_proj and v_proj layers of DINOv2. lora_r and lora_α are set to 8. And the model is optimized using the AdamW optimizer with a learning rate of $1 \times 10^{-5}$, β1 = 0.9, β2 = 0.99, and a weight decay of 0.01. Following CNNspot (Wang et al., 2020), data augmentation techniques including JPEG compression and Gaussian blur are employed to enhance robustness. |
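
The weight perturbation described in the Experiment Setup row can be illustrated with a minimal, standard-library sketch. It is not the authors' implementation (see their repository for that). The function name `perturb_blocks`, the toy parameter lists standing in for transformer blocks, and the fixed seed are all hypothetical. The row states that the noise variance is proportional to the mean parameter value with ratio 0.1; this sketch makes the simplifying assumption that the noise standard deviation equals the ratio times the mean absolute parameter value of each block.

```python
import random
from statistics import fmean


def perturb_blocks(blocks, num_perturbed=19, ratio=0.1, seed=0):
    """Return a copy of `blocks` with zero-mean Gaussian noise added
    to the parameters of the first `num_perturbed` blocks only."""
    rng = random.Random(seed)
    perturbed = []
    for index, params in enumerate(blocks):
        if index >= num_perturbed or not params:
            # Later blocks (and empty blocks) are copied unchanged.
            perturbed.append(list(params))
            continue
        # Assumption: noise scale = ratio * mean |parameter| of this block.
        sigma = ratio * fmean(abs(p) for p in params)
        perturbed.append([p + rng.gauss(0.0, sigma) for p in params])
    return perturbed


# Toy stand-in for the 24 transformer blocks of DINOv2 ViT-L/14.
blocks = [[0.5, -0.25, 1.0, -1.5] for _ in range(24)]
noisy = perturb_blocks(blocks)
```

In this toy setup, `noisy[:19]` differs from the original blocks while `noisy[19:]` is an unchanged copy, mirroring the statement that only the first 19 of 24 blocks are perturbed.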