Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Detecting Generated Images by Fitting Natural Image Distributions

Authors: Yonggang Zhang, Jun Nie, Xinmei Tian, Mingming Gong, Kun Zhang, Bo Han

NeurIPS 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experiments demonstrate the efficacy of this method. Code is available at https://github.com/tmlr-group/ConV. Comprehensive experiments across various benchmarks for generated image detection demonstrate the effectiveness of the proposed ConV (see Tables 1-7). To further verify the effectiveness of the proposed ConV, we collect images generated by Sora (OpenAI, 2024) and OpenSora (Zheng et al., 2024) and compare ConV with baselines. The experiments demonstrate the efficacy and robustness of ConV against variations in generative models (see Table 2).
Researcher Affiliation	Academia	Yonggang Zhang¹ (The Hong Kong University of Science and Technology); Jun Nie²,³ (TMLR Group, Hong Kong Baptist University); Xinmei Tian³ (University of Science and Technology of China); Mingming Gong⁴,⁶ (The University of Melbourne, Australia); Kun Zhang⁵,⁶ (Carnegie Mellon University); Bo Han² (TMLR Group, Hong Kong Baptist University)
Pseudocode	No	The paper includes mathematical formulations and describes the framework in text and Figure 4, but it does not contain a clearly labeled pseudocode or algorithm block.
Open Source Code	Yes	Code is available at https://github.com/tmlr-group/ConV.
Open Datasets	Yes	Datasets. Following previous work (Chen et al., 2024), we evaluate ConV on several benchmarks: ImageNet (Deng et al., 2009), LSUN-BEDROOM (Yu et al., 2015), GenImage (Zhu et al., 2023b) and DRCT-2M (Chen et al., 2024). Detailed dataset description can be found in the Appendix G. IMAGENET. The natural images and generated images can be obtained at https://github.com/layer6ai-labs/dgm-eval. The images are provided by (Stein et al., 2023). LSUN-BEDROOM. The natural images and generated images can be obtained at https://github.com/layer6ai-labs/dgm-eval. The images are provided by (Stein et al., 2023). GenImage. The natural images and generated images can be obtained at https://github.com/GenImage-Dataset/GenImage. The images are provided by (Zhu et al., 2023b). DRCT-2M. The natural images of DRCT-2M come from CoCo and can be obtained from https://cocodataset.org/#download. AI-generated images of DRCT-2M can be obtained from https://modelscope.cn/datasets/BokingChen/DRCT-2M/files, which are provided by (Chen et al., 2024).
Dataset Splits	No	The paper mentions various datasets used for evaluation and indicates that for some benchmarks, specific datasets are used as 'training set' (e.g., 'For the ImageNet and LSUN-Bedroom benchmarks, the ProGAN dataset is used as the training set'). However, it does not explicitly provide the specific training, validation, and test splits (e.g., percentages or sample counts) for the benchmark datasets themselves, which are essential for direct reproducibility of data partitioning.
Hardware Specification	Yes	We use Python 3.8.16 and PyTorch 1.12.1, and several NVIDIA GeForce RTX-3090 GPUs and NVIDIA GeForce RTX-4090 GPUs.
Software Dependencies	Yes	We use Python 3.8.16 and PyTorch 1.12.1, and several NVIDIA GeForce RTX-3090 GPUs and NVIDIA GeForce RTX-4090 GPUs.
Experiment Setup	Yes	Implementation details. In our experiments, we use the DINOv2 to instantiate f1(·) and common transformation (details are in Appendix E) to realize h(·). To balance detection performance and efficiency, we use DINOv2 ViT-L/14 in the following experiments. However, to maintain detection efficiency, we set n = 20 in our experiments. For F-ConV, we train a RealNVP (Dinh et al., 2017) on the top of DINOv2 ViT-L/14, which consists of 2 coupling blocks with fully connected networks as internal functions. The model is trained using AdamW with a learning rate of 1e-5. More detailed implementation information is provided in Appendix F. Appendix F: The model is optimized using the AdamW optimizer with a learning rate of 1 × 10⁻⁵, β1 = 0.9, β2 = 0.99, and a weight decay of 0.01.
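
Illustrative sketch (not taken from the paper or its repository): the Experiment Setup row mentions a RealNVP with 2 coupling blocks whose internal functions are fully connected networks. The standard-library Python below shows what a two-block affine coupling flow looks like in general and why it stays exactly invertible. Everything specific is an assumption made for illustration: `toy_net` is a hypothetical stand-in for the fully connected networks, the half-and-half split with a swap between blocks is one common choice, and the standard normal base density is assumed rather than reported. The authors' implementation may differ on all of these points. Only the `HYPERPARAMS` values are copied from the quoted setup.

```python
import math
from typing import Callable, List, Tuple

Vector = List[float]

# Optimizer settings quoted in the Experiment Setup row (recorded only; the toy flow is not trained).
HYPERPARAMS = {"optimizer": "AdamW", "lr": 1e-5, "betas": (0.9, 0.99), "weight_decay": 0.01}


def toy_net(weight: float, bias: float) -> Callable[[Vector], Vector]:
    """Hypothetical stand-in for a fully connected network: elementwise tanh(w * x + b)."""
    return lambda x: [math.tanh(weight * v + bias) for v in x]


class AffineCoupling:
    """One RealNVP-style affine coupling block acting on an even-length vector."""

    def __init__(self, scale_net, shift_net, swap: bool):
        self.scale_net = scale_net
        self.shift_net = shift_net
        self.swap = swap  # if True, the second half is the one passed through unchanged

    def _split(self, x: Vector) -> Tuple[Vector, Vector]:
        half = len(x) // 2
        first, second = x[:half], x[half:]
        return (second, first) if self.swap else (first, second)

    def _merge(self, keep: Vector, change: Vector) -> Vector:
        return change + keep if self.swap else keep + change

    def forward(self, x: Vector) -> Tuple[Vector, float]:
        keep, change = self._split(x)
        s, t = self.scale_net(keep), self.shift_net(keep)
        out = [c * math.exp(si) + ti for c, si, ti in zip(change, s, t)]
        # The Jacobian is triangular, so its log-determinant is the sum of the log-scales.
        return self._merge(keep, out), sum(s)

    def inverse(self, y: Vector) -> Vector:
        keep, change = self._split(y)
        s, t = self.scale_net(keep), self.shift_net(keep)
        out = [(c - ti) * math.exp(-si) for c, si, ti in zip(change, s, t)]
        return self._merge(keep, out)


def flow_forward(blocks, x: Vector) -> Tuple[Vector, float]:
    """Apply the blocks in order and accumulate the log-determinant."""
    total = 0.0
    for block in blocks:
        x, log_det = block.forward(x)
        total += log_det
    return x, total


def flow_inverse(blocks, z: Vector) -> Vector:
    """Undo the blocks in reverse order."""
    for block in reversed(blocks):
        z = block.inverse(z)
    return z


def log_likelihood(blocks, x: Vector) -> float:
    """Change-of-variables log-density, assuming a standard normal base distribution."""
    z, log_det = flow_forward(blocks, x)
    base = sum(-0.5 * v * v - 0.5 * math.log(2.0 * math.pi) for v in z)
    return base + log_det


# Two coupling blocks; the swap makes each half get transformed once.
blocks = [
    AffineCoupling(toy_net(0.5, 0.1), toy_net(-0.3, 0.2), swap=False),
    AffineCoupling(toy_net(0.4, -0.1), toy_net(0.2, 0.0), swap=True),
]
```

Calling `flow_forward(blocks, x)` followed by `flow_inverse(blocks, z)` should return the input up to floating-point error, which is the property that lets a flow of this kind assign an exact log-density to a feature vector.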