Dataset Distillation using Neural Feature Regression

Authors: Yongchao Zhou, Ehsan Nezhadarya, Jimmy Ba

NeurIPS 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We propose an effective method for dataset distillation. Our method, named neural Feature Regression with Pooling (FRePo), achieves state-of-the-art results on various benchmark datasets with a 100x reduction in training time and a 10x reduction in GPU memory requirement. We compare our method to four state-of-the-art dataset distillation methods [7, 8, 20, 23] on various benchmark datasets [26, 32, 37].
Researcher Affiliation | Collaboration | Yongchao Zhou, Department of Computer Science, University of Toronto, yongchao.zhou@mail.utoronto.ca; Ehsan Nezhadarya, Toronto AI Lab, LG Electronics Canada, ehsan.nezhadarya@lge.com; Jimmy Ba, Department of Computer Science, University of Toronto, jba@cs.toronto.edu
Pseudocode | Yes | Algorithm 1: Dataset Distillation using Neural Feature Regression with Pooling (FRePo)
Open Source Code | Yes | Our code is available at https://github.com/yongchao97/FRePo.
Open Datasets | Yes | We compare our method to four state-of-the-art dataset distillation methods [7, 8, 20, 23] on various benchmark datasets [26, 32, 37].
Dataset Splits | Yes | We train several neural networks parameterized by θ on the dataset S and then compute the validation loss L(Alg(θ, S), T) on the real dataset T. In contrast, at meta-test time, we train a new model from scratch on S and evaluate the trained model on a held-out real dataset. (A minimal sketch of this meta-test protocol follows the table.)
Hardware Specification | No | The paper mentions 'GPU memory requirement' but does not provide specific hardware details such as exact GPU/CPU models, processor types, or memory amounts used for experiments.
Software Dependencies | No | The paper mentions 'Tensorflow Privacy [62]' but does not provide specific version numbers for any software dependencies, libraries, or solvers used in their experiments.
Experiment Setup | No | The paper states, 'We provide implementation details about data preprocessing, distilled data initialization, and hyperparameters in Appendix ??' and 'various ablation studies regarding the model pool, batch size, distilled data initialization, label learning, and model architectures in Appendix ??', explicitly deferring these specific experimental setup details to the appendix rather than providing them in the main text.
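The Dataset Splits entry quotes the paper's evaluation protocol: a model is trained from scratch on the distilled set S and then scored on a held-out real dataset T. The sketch below illustrates that meta-test loop only; the linear softmax classifier, function names, and hyperparameters are illustrative assumptions on our part, not the paper's setup (the paper trains convolutional networks on the learned synthetic images).

```python
import numpy as np

def train_on_distilled(x_s, y_s, num_classes, lr=0.1, steps=500, seed=0):
    # Fit a linear softmax classifier from scratch on the distilled set S.
    # Illustrative stand-in for Alg(theta, S); FRePo itself trains ConvNets.
    rng = np.random.default_rng(seed)
    w = rng.normal(scale=0.01, size=(x_s.shape[1], num_classes))
    y_onehot = np.eye(num_classes)[y_s]
    for _ in range(steps):
        logits = x_s @ w
        logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        grad = x_s.T @ (probs - y_onehot) / len(x_s)  # softmax cross-entropy gradient
        w -= lr * grad
    return w

def meta_test_accuracy(w, x_real, y_real):
    # Evaluate the model trained only on S against the held-out real dataset T.
    return float((np.argmax(x_real @ w, axis=1) == y_real).mean())

# Example usage (hypothetical arrays): x_distilled of shape (num_images, d) holds
# flattened synthetic images with labels y_distilled; x_test / y_test are the real
# held-out split.
#   w = train_on_distilled(x_distilled, y_distilled, num_classes=10)
#   print(meta_test_accuracy(w, x_test, y_test))
```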