Dataset Condensation with Contrastive Signals

Authors: Saehyung Lee, Sanghyuk Chun, Sangwon Jung, Sangdoo Yun, Sungroh Yoon

ICML 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experimental results indicate that while the existing methods are ineffective for fine-grained image classification tasks, the proposed method can successfully generate informative synthetic datasets for the same tasks. Moreover, we demonstrate that the proposed method outperforms the baselines even on benchmark datasets such as SVHN, CIFAR-10, and CIFAR-100.
Researcher Affiliation | Collaboration | 1Department of Electrical and Computer Engineering, Seoul National University; 2NAVER AI Lab; 3Interdisciplinary Program in Artificial Intelligence, Seoul National University.
Pseudocode | Yes | Algorithm 1: Dataset condensation with contrastive signals
Open Source Code | Yes | The code of our study is available at: https://github.com/Saehyung-Lee/DCC
Open Datasets | Yes | Datasets. We complement our analysis with experiments conducted on SVHN, CIFAR-10, CIFAR-100, and the fine-grained image classification datasets (Automobile, Terrier, Fish, Truck, Insect, and Lizard) subsampled from ImageNet32x32 (Chrabaszcz et al., 2017) using the WordNet hierarchy (Miller, 1998).
Dataset Splits | No | The paper describes using standard benchmark datasets like CIFAR-10 and SVHN, which have predefined train/test splits. For example, "CIFAR-10 (Krizhevsky et al., 2009) consists of 50,000 training images and 10,000 test images in 10 classes." However, the main text does not explicitly state the use or size of a separate validation split for hyperparameter tuning or early stopping, nor does it give percentages or counts for such a split.
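Since the paper does not specify a validation split, the following is a minimal sketch of one plausible convention: carving a holdout set out of CIFAR-10's standard 50,000-image training set. The 10% holdout fraction is our own assumption for illustration, not a value reported by the authors.

```python
# Hedged sketch: the paper reports no validation split, so this merely
# illustrates a common holdout convention. The 10% fraction is an
# assumption, not a number from the paper.

def split_counts(n_train: int, val_fraction: float) -> tuple[int, int]:
    """Return (train, val) example counts for a simple holdout split."""
    n_val = int(n_train * val_fraction)
    return n_train - n_val, n_val

# CIFAR-10's predefined training set has 50,000 images; the 10,000
# predefined test images would remain untouched for final evaluation.
n_train, n_val = split_counts(50_000, val_fraction=0.10)
print(n_train, n_val)  # 45000 5000
```

Any such split would only affect model selection; the reported results would still be computed on the predefined test sets.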
Hardware Specification | No | The paper mentions that "Most experiments were conducted on NAVER Smart Machine Learning (NSML) platform (Sung et al., 2017; Kim et al., 2018)," but does not provide specific hardware details such as GPU or CPU models used for the experiments.
Software Dependencies | No | The paper mentions the use of certain models (e.g., ConvNet, ResNet-18, VGG-11) and refers to the code for KIP provided by its authors, but it does not specify any software libraries or frameworks with their version numbers (e.g., "PyTorch 1.9", "TensorFlow 2.x").
Experiment Setup | Yes | Implementation Details. In our experiments, we compare the proposed method with the baseline methods for the settings of learning 1, 10, and 50 image(s) per class as in Zhao et al. (2021). We use ConvNet (Gidaris & Komodakis, 2018) as a classifier from which the gradients for matching are obtained in the dataset condensation process. We set Ko = 1000, γo = 250, γi = 10, and τ = 0.1. For the settings of learning 1, 10, and 50 image(s) per class, (Ki, T) is set to (10, 5), (10, 50), and (50, 10), respectively.
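The reported hyperparameters can be collected into a single configuration for a reproduction attempt. This is a sketch only: the variable names below (`K_outer`, `gamma_outer`, etc.) are our own labels for the paper's Ko, γo, γi, τ, Ki, and T, not identifiers from the authors' released code.

```python
# Hedged sketch: hyperparameters as reported in the paper's
# "Implementation Details". Names are our own; check the authors'
# repository (github.com/Saehyung-Lee/DCC) for the actual identifiers.

CONDENSATION_CONFIG = {
    "K_outer": 1000,      # Ko: outer-loop iterations
    "gamma_outer": 250,   # γo
    "gamma_inner": 10,    # γi
    "temperature": 0.1,   # τ: contrastive temperature
}

# (Ki, T) per images-per-class (IPC) setting, for IPC = 1, 10, 50.
INNER_LOOP_BY_IPC = {
    1: (10, 5),
    10: (10, 50),
    50: (50, 10),
}

def settings_for_ipc(ipc: int) -> dict:
    """Merge the shared config with the IPC-specific (Ki, T) pair."""
    k_inner, t = INNER_LOOP_BY_IPC[ipc]
    return {**CONDENSATION_CONFIG, "K_inner": k_inner, "T": t}
```

Keeping the IPC-dependent values in a lookup table makes it easy to sweep all three settings from one script, mirroring the 1/10/50 images-per-class protocol of Zhao et al. (2021).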