Expanding Small-Scale Datasets with Guided Imagination

Authors: Yifan Zhang, Daquan Zhou, Bryan Hooi, Kai Wang, Jiashi Feng

NeurIPS 2023

Reproducibility assessment (each entry lists the variable, the result, and the LLM response):
Research Type: Experimental. "Specifically, GIF-SD obtains 13.5% higher model accuracy on natural image datasets than unguided expansion with SD. With these essential criteria, GIF successfully expands small datasets in various scenarios, boosting model accuracy by 36.9% on average over six natural image datasets and by 13.5% on average over three medical datasets."
Researcher Affiliation: Collaboration. Yifan Zhang (National University of Singapore), Daquan Zhou (ByteDance), Bryan Hooi (National University of Singapore), Kai Wang (National University of Singapore), Jiashi Feng (ByteDance).
Pseudocode: Yes. The paper provides Algorithm 1, the GIF-DALLE algorithm.
Open Source Code: Yes. "The source code is available at https://github.com/Vanint/DatasetExpansion."
Open Datasets: Yes. "We evaluate GIF on six small-scale natural image datasets and three medical datasets. Natural datasets cover a variety of tasks: object classification (Caltech-101 [18], CIFAR100-Subset [41]), fine-grained classification (Cars [40], Flowers [49], Pets [50]) and texture classification (DTD [10]). Moreover, medical datasets [90] cover a wide range of image modalities, such as breast ultrasound (BreastMNIST), colon pathology (PathMNIST), and abdominal CT (OrganSMNIST)."
Dataset Splits: No. The paper mentions using 'validation sets' for some medical datasets, but does not provide train/validation/test split percentages or exact counts for all datasets in a clearly reproducible manner. For example, it states 'we use the validation sets of BreastMNIST and PathMNIST for experiments instead of training sets', which is an unusual and unclear description of a standard split.
Hardware Specification: Yes. "costing roughly $40 for renting 8 V100 GPUs."
Software Dependencies: No. The paper mentions implementing GIF in PyTorch based on specific model versions such as 'CLIP ViT-B/32' and 'Stable Diffusion V1-4', but it does not specify software library versions (e.g., 'PyTorch 1.9', 'Python 3.8', 'NumPy 1.20').
Experiment Setup: Yes. "After expansion, we train ResNet-50 [25] from scratch for 100 epochs based on the expanded datasets. During model training, we process images via random resize to 224×224 through bicubic sampling, random rotation, and random flips. If not specified, we use the SGD optimizer with a momentum of 0.9. We set the initial learning rate (LR) to 0.01 with cosine LR decay, except the initial LR of CIFAR100-Subset and OrganSMNIST is 0.1."