Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. [K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026].
GMValuator: Similarity-based Data Valuation for Generative Models
Authors: Jiaxi Yang, Wenlong Deng, Benlin Liu, Yangsibo Huang, James Y Zou, Xiaoxiao Li
ICLR 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | GMVALUATOR is extensively evaluated on benchmark and high-resolution datasets and various mainstream generative architectures to demonstrate its effectiveness. |
| Researcher Affiliation | Academia | 1University of British Columbia 2University of Washington 3Princeton University 4Stanford University 5Vector Institute |
| Pseudocode | Yes | A concise summary of key notations and Algorithm 1, detailing the pipeline of GMValuator in Sec. A and Sec. B. |
| Open Source Code | Yes | Our code is available at: https://github.com/ubc-tea/GMValuator. |
| Open Datasets | Yes | The generation tasks are conducted on benchmark datasets (i.e., MNIST LeCun et al. (1998) and CIFAR Krizhevsky et al. (2009)), face recognition dataset (i.e., CelebA Liu et al. (2018)), high-resolution image dataset with size 512 × 512, and 1024 × 1024 (i.e., AFHQ Choi et al. (2020), FFHQ Karras et al. (2019)), the large-scale dataset with 1,000 classes and 14,197,122 images (i.e., ImageNet Deng et al. (2009)), and text-to-image dataset (i.e., Naruto Cervenka (2022)). |
| Dataset Splits | Yes | We support this by partitioning a class of CIFAR-10 (the class is plane here) into two non-overlapped subsets, denoted as Xv1 and Xv2. Next, we keep Xv1 as non-training data and use Xv2 as training data to train a BigGAN Brock et al. (2018) and generate dataset X̂. If our assumption holds, the generated data will be more similar to the training data Xv2. |
| Hardware Specification | Yes | GPU: One RTX 3080 (10GB); CPU: 12 vCPU Intel(R) Xeon(R) Platinum 8255C CPU @ 2.50GHz |
| Software Dependencies | No | The paper mentions using specific tools/libraries like CLIP, MANIQA, LPIPS, DreamSim, and Product Quantization, but does not provide specific version numbers for these or for the underlying programming languages/frameworks. |
| Experiment Setup | Yes | We report the averaged ρ over the generated datasets (the data size m=100) on different choices of k in Table 2. |
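
The Dataset Splits entry describes partitioning one class into two non-overlapping subsets, Xv1 (held out) and Xv2 (used for training). A minimal sketch of such a partition is given below. It is not taken from the paper's repository: the function name `split_class_indices`, the 50/50 split ratio, the fixed seed, and the toy label list are all assumptions made for illustration, since the quoted text does not state how the split was drawn.

```python
import random


def split_class_indices(labels, target_class, frac_v1=0.5, seed=0):
    """Partition the indices of one class into two disjoint subsets.

    Hypothetical helper: frac_v1 and seed are illustrative assumptions,
    not values reported in the paper.
    """
    # Collect the positions of every sample belonging to the target class.
    class_idx = [i for i, y in enumerate(labels) if y == target_class]

    # Shuffle with a local RNG so the split is reproducible.
    rng = random.Random(seed)
    rng.shuffle(class_idx)

    # Cut once, so the two subsets cannot share an index.
    cut = int(len(class_idx) * frac_v1)
    x_v1 = sorted(class_idx[:cut])   # held out as non-training data
    x_v2 = sorted(class_idx[cut:])   # used as training data
    return x_v1, x_v2


# Toy labels standing in for a labeled dataset (assumption).
labels = [0, 1, 0, 2, 0, 0, 1, 0, 0, 2]
x_v1, x_v2 = split_class_indices(labels, target_class=0)
```

Because both subsets come from a single shuffled list cut at one point, they are disjoint by construction and together cover every sample of the chosen class.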