An Efficient Dataset Condensation Plugin and Its Application to Continual Learning

Authors: Enneng Yang, Li Shen, Zhenyi Wang, Tongliang Liu, Guibing Guo

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We verify on multiple public datasets that when the proposed plugin is combined with SOTA DC methods, the performance of the network trained on synthetic data is significantly improved compared to traditional DC methods.
Researcher Affiliation | Collaboration | Enneng Yang (1), Li Shen (2)*, Zhenyi Wang (3)*, Tongliang Liu (4), Guibing Guo (1); (1) Northeastern University, China; (2) JD Explore Academy, China; (3) University of Maryland, College Park, USA; (4) The University of Sydney, Australia
Pseudocode | Yes | Algorithm 1: LoDC: Low-rank Dataset Condensation with Gradient Matching (a hedged code sketch of this gradient-matching step is given after the table)
Open Source Code | No | No explicit statement or link is provided for open-source code release.
Open Datasets | Yes | We evaluate our low-rank DC plugin on four benchmark datasets as DM [65], including MNIST [29], CIFAR10 [26], CIFAR100 [26], and Tiny ImageNet [28].
Dataset Splits | No | The paper refers to a 'large target dataset T' and an 'unseen test dataset' but does not explicitly provide percentages or sample counts for training, validation, and test splits, nor does it state the split methodology (e.g., an 80/10/10 split or citations for standard splits).
Hardware Specification | No | The paper does not specify the hardware (e.g., GPU models, CPU types) used to run its experiments.
Software Dependencies | No | The paper mentions optimizers such as SGD and network architectures (ConvNet) but does not provide version numbers for software libraries or dependencies such as Python, PyTorch, or TensorFlow.
Experiment Setup | Yes | The batch size of the raw images is set to 256 during the matching process. For the DC [66], DSA [64], LoDC, and LoDSA methods, the outer loop is set to 1 and the inner loop is set to 1 when using 1 image per class in all experimental datasets. When using 10 images per class, the outer loop is set to 10 and the inner loop is set to 50. In the MNIST and CIFAR10 datasets, when using 50 images per class, the outer loop is set to 50 and the inner loop is set to 10. For the DM [65] and LoDM methods, the loop is set to 20,000 in all experiments. The optimizers all use SGD. When optimizing the condensed data, the learning rate for DC/DSA/LoDC/LoDSA is set to 0.1 by default, and the learning rate for DM/LoDM is set to 1.0 by default. When using condensed data to train the network, the number of network updates is set to 1,000, and the learning rate is set to 0.01. (These settings are also collected into a configuration sketch after the table.)
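
For readers who want a concrete picture of Algorithm 1, the following is a minimal PyTorch sketch of low-rank dataset condensation with gradient matching. It assumes, rather than reproduces, the paper's design: each synthetic image channel is parameterized as the product of two learnable low-rank factors, and those factors are optimized with the standard DC gradient-matching objective against a small ConvNet-style backbone. The rank r, the tiny backbone, the random stand-in for a sampled real batch, and all helper names are illustrative assumptions, not the authors' code.

import torch
import torch.nn as nn
import torch.nn.functional as F

# Illustrative sizes for a CIFAR10-style setting: 10 classes, 1 image per class.
num_classes, ipc = 10, 1
c, h, w, r = 3, 32, 32, 2  # channels, height, width, assumed low rank

# Low-rank plugin (assumption): each synthetic image channel is U @ V
# instead of a full h x w tensor of free pixels.
U = torch.randn(num_classes * ipc, c, h, r, requires_grad=True)
V = torch.randn(num_classes * ipc, c, r, w, requires_grad=True)
syn_y = torch.arange(num_classes).repeat_interleave(ipc)

def synthesize():
    # Recover full-resolution synthetic images from the low-rank factors.
    return torch.matmul(U, V)  # shape (num_classes * ipc, c, h, w)

# Tiny stand-in for the ConvNet backbone used by DC-style methods.
net = nn.Sequential(
    nn.Conv2d(c, 64, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(64, num_classes),
)

def gradient_match_loss(real_x, real_y, syn_x, syn_labels):
    # Sum of layer-wise cosine distances between gradients on real and synthetic batches.
    params = [p for p in net.parameters() if p.requires_grad]
    g_real = torch.autograd.grad(F.cross_entropy(net(real_x), real_y), params)
    g_real = [g.detach() for g in g_real]
    # create_graph=True so the matching loss can back-propagate into U and V.
    g_syn = torch.autograd.grad(F.cross_entropy(net(syn_x), syn_labels), params, create_graph=True)
    loss = syn_x.new_zeros(())
    for gr, gs in zip(g_real, g_syn):
        loss = loss + 1.0 - F.cosine_similarity(gr.flatten(), gs.flatten(), dim=0)
    return loss

# Optimize the low-rank factors; the 0.1 learning rate follows the setup reported above.
opt_syn = torch.optim.SGD([U, V], lr=0.1)
for step in range(10):  # the real schedule nests outer/inner loops and network updates
    real_x = torch.randn(256, c, h, w)              # stand-in for a sampled real batch
    real_y = torch.randint(0, num_classes, (256,))  # stand-in labels
    loss = gradient_match_loss(real_x, real_y, synthesize(), syn_y)
    opt_syn.zero_grad()
    loss.backward()
    opt_syn.step()

The actual algorithm samples real and synthetic batches class by class and interleaves updates of the network parameters with updates of the condensed data; the sketch omits that bookkeeping to keep the low-rank parameterization and the gradient-matching objective in focus.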
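
The experiment-setup row above packs several hyperparameters into prose; the dictionary below restates them in one place. The key names are invented for readability, the values are exactly those reported in the row, and nothing beyond what is stated there is filled in.

config = {
    "real_batch_size": 256,                # batch size of raw images during matching
    # Outer/inner loops for DC / DSA / LoDC / LoDSA, keyed by images per class.
    "outer_loop": {1: 1, 10: 10, 50: 50},
    "inner_loop": {1: 1, 10: 50, 50: 10},  # the 50-ipc values are reported for MNIST and CIFAR10
    "dm_iterations": 20_000,               # matching loop for DM / LoDM
    "optimizer": "SGD",
    "lr_condensed_dc_dsa": 0.1,            # learning rate for condensed data (DC/DSA/LoDC/LoDSA)
    "lr_condensed_dm": 1.0,                # learning rate for condensed data (DM/LoDM)
    "eval_net_updates": 1_000,             # updates when training a network on the condensed data
    "eval_net_lr": 0.01,
}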