IS SYNTHETIC DATA FROM GENERATIVE MODELS READY FOR IMAGE RECOGNITION?

Authors: Ruifei He, Shuyang Sun, Xin Yu, Chuhui Xue, Wenqing Zhang, Philip Torr, Song Bai, Xiaojuan Qi

ICLR 2023

Reproducibility

Variable | Result | LLM Response
Research Type | Experimental | In this work, we extensively study whether and how synthetic images generated from state-of-the-art text-to-image generation models can be used for image recognition tasks, focusing on two perspectives: synthetic data for improving classification models in data-scarce settings (i.e., zero-shot and few-shot), and synthetic data for large-scale model pre-training for transfer learning. We showcase the strengths and shortcomings of synthetic data from existing generative models, and propose strategies for better applying synthetic data to recognition tasks.
Researcher Affiliation | Collaboration | 1The University of Hong Kong, 2University of Oxford, 3ByteDance
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | Code: https://github.com/CVMI-Lab/SyntheticData
Open Datasets | Yes | We select 17 diverse datasets covering object-level (CIFAR-10 and CIFAR-100 (Krizhevsky et al., 2009), Caltech101 (Fei-Fei et al., 2006), Caltech256 (Griffin et al., 2007), ImageNet (Deng et al., 2009)), scene-level (SUN397 (Xiao et al., 2010)), fine-grained (Aircraft (Maji et al., 2013), Birdsnap (Berg et al., 2014), Cars (Krause et al., 2013), CUB (Wah et al., 2011), Flower (Nilsback & Zisserman, 2008), Food (Bossard et al., 2014), Pets (Parkhi et al., 2012)), textures (DTD (Cimpoi et al., 2014)), satellite images (EuroSAT (Helber et al., 2019)), and robustness (ImageNet-Sketch (Wang et al., 2019), ImageNet-R (Hendrycks et al., 2021)) benchmarks for zero-shot image classification.
Dataset Splits | Yes | In an N-way M-shot case, we are given M real images of each test class, where M ∈ {1, 2, 4, 8, 16} in our experiments. ... We pre-train the model on the generated synthetic labeled set in a supervised manner, and then perform evaluation after finetuning the model on CIFAR-100.
Hardware Specification | No | The paper does not provide specific details about the hardware used, such as the GPU or CPU models used to run its experiments.
Software Dependencies | No | The paper mentions specific models and frameworks such as the T5 model, CLIP, ResNet-50, and MoCo v2, but does not provide version numbers for software dependencies such as Python, PyTorch, or CUDA.
Experiment Setup | Yes | For synthetic data amount, we generate 2000 synthetic images for each class in B and LE (study of synthetic image number in Appendix Sec. B.3). For LE, we generate 200 sentences for each class. ... In an N-way M-shot case, we are given M real images of each test class, where M ∈ {1, 2, 4, 8, 16} in our experiments. ... freezing the BN layers yields much better performance.
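The N-way M-shot protocol quoted above (M real images per test class, M ∈ {1, 2, 4, 8, 16}) can be sketched with a minimal, stdlib-only sampler. The function name `sample_few_shot` and the data layout are illustrative assumptions, not taken from the authors' released code:

```python
import random
from collections import defaultdict

def sample_few_shot(dataset, num_shots, seed=0):
    """Build an N-way M-shot support set from (item, label) pairs.

    Hypothetical helper: groups examples by class, then draws
    `num_shots` examples per class with a fixed seed so the split
    is reproducible across runs.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for item, label in dataset:
        by_class[label].append(item)
    support = {}
    for label, items in sorted(by_class.items()):
        support[label] = rng.sample(items, min(num_shots, len(items)))
    return support

# Toy usage: a 3-way 2-shot split over 30 dummy examples.
data = [(f"img_{i}", i % 3) for i in range(30)]
support = sample_few_shot(data, num_shots=2)
```

Seeding the sampler matters here: few-shot results vary heavily with which M images are drawn, so reproducibility reports typically expect the split procedure (or the splits themselves) to be fixed.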