Dataset Condensation with Contrastive Signals
Authors: Saehyung Lee, Sanghyuk Chun, Sangwon Jung, Sangdoo Yun, Sungroh Yoon
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experimental results indicate that while the existing methods are ineffective for fine-grained image classification tasks, the proposed method can successfully generate informative synthetic datasets for the same tasks. Moreover, we demonstrate that the proposed method outperforms the baselines even on benchmark datasets such as SVHN, CIFAR-10, and CIFAR-100. |
| Researcher Affiliation | Collaboration | 1Department of Electrical and Computer Engineering, Seoul National University 2NAVER AI Lab 3Interdisciplinary Program in Artificial Intelligence, Seoul National University. |
| Pseudocode | Yes | Algorithm 1 Dataset condensation with contrastive signals |
| Open Source Code | Yes | The code of our study is available at: https://github.com/Saehyung-Lee/DCC |
| Open Datasets | Yes | Datasets. We complement our analysis with experiments conducted on SVHN, CIFAR-10, CIFAR-100, and the fine-grained image classification datasets (Automobile, Terrier, Fish, Truck, Insect, and Lizard) subsampled from ImageNet32x32 (Chrabaszcz et al., 2017) using the WordNet hierarchy (Miller, 1998). |
| Dataset Splits | No | The paper uses standard benchmark datasets such as CIFAR-10 and SVHN, which have predefined train/test splits. For example, 'CIFAR-10 (Krizhevsky et al., 2009) consists of 50,000 training images and 10,000 test images in 10 classes.' However, it does not state whether a separate validation split was used for hyperparameter tuning or early stopping, nor give the size or percentage of such a split. |
| Hardware Specification | No | The paper mentions that 'Most experiments were conducted on NAVER Smart Machine Learning (NSML) platform (Sung et al., 2017; Kim et al., 2018),' but does not provide specific hardware details such as GPU or CPU models used for the experiments. |
| Software Dependencies | No | The paper mentions the use of certain models (e.g., ConvNet, ResNet-18, VGG-11) and refers to the code for KIP provided by its authors, but it does not specify any software libraries or frameworks with their version numbers (e.g., 'PyTorch 1.9', 'TensorFlow 2.x'). |
| Experiment Setup | Yes | Implementation Details. In our experiments, we compare the proposed method with the baseline methods for the settings of learning 1, 10, and 50 image(s) per class as in Zhao et al. (2021). We use ConvNet (Gidaris & Komodakis, 2018) as a classifier from which the gradients for matching are obtained in the dataset condensation process. We set Ko = 1000, γo = 250, γi = 10, and τ = 0.1. For settings of learning 1, 10, and 50 image(s) per class, (Ki, T) is set to (10, 5), (10, 50), and (50, 10), respectively. |
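The reported hyperparameters can be collected into a small lookup for anyone attempting a reproduction. The sketch below is illustrative only: the variable names (`K_outer`, `gamma_outer`, `get_config`, etc.) are hypothetical stand-ins for the paper's Ko, γo, γi, τ, Ki, and T, and the actual DCC repository may organize its configuration differently.

```python
# Shared hyperparameters reported in the paper (names are illustrative).
CONDENSATION_HPARAMS = {
    "K_outer": 1000,     # Ko: outer-loop iterations
    "gamma_outer": 250,  # γo
    "gamma_inner": 10,   # γi
    "tau": 0.1,          # τ: temperature
}

# (Ki, T) varies with the images-per-class (IPC) setting, per the paper.
PER_IPC = {
    1:  {"K_inner": 10, "T": 5},
    10: {"K_inner": 10, "T": 50},
    50: {"K_inner": 50, "T": 10},
}

def get_config(ipc):
    """Merge the shared hyperparameters with the IPC-specific ones."""
    if ipc not in PER_IPC:
        raise ValueError(f"No reported setting for IPC={ipc}")
    return {**CONDENSATION_HPARAMS, **PER_IPC[ipc]}
```

For example, `get_config(10)` yields the 10-images-per-class setting with Ki = 10 and T = 50 alongside the shared values.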