Dataset Distillation using Neural Feature Regression

Authors: Yongchao Zhou, Ehsan Nezhadarya, Jimmy Ba

NeurIPS 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We propose an effective method for dataset distillation. Our method, named neural Feature Regression with Pooling (FRePo), achieves state-of-the-art results on various benchmark datasets with a 100x reduction in training time and a 10x reduction in GPU memory requirement. We compare our method to four state-of-the-art dataset distillation methods [7, 8, 20, 23] on various benchmark datasets [26, 32, 37].
Researcher Affiliation | Collaboration | Yongchao Zhou, Department of Computer Science, University of Toronto, yongchao.zhou@mail.utoronto.ca; Ehsan Nezhadarya, Toronto AI Lab, LG Electronics Canada, ehsan.nezhadarya@lge.com; Jimmy Ba, Department of Computer Science, University of Toronto, jba@cs.toronto.edu
Pseudocode | Yes | Algorithm 1: Dataset Distillation using Neural Feature Regression with Pooling (FRePo)
Open Source Code | Yes | Our code is available at https://github.com/yongchao97/FRePo.
Open Datasets | Yes | We compare our method to four state-of-the-art dataset distillation methods [7, 8, 20, 23] on various benchmark datasets [26, 32, 37].
Dataset Splits | Yes | We train several neural networks parameterized by θ on the dataset S and then compute the validation loss L(Alg(θ, S), T) on the real dataset T. In contrast, at meta-test time, we train a new model from scratch on S and evaluate the trained model on a held-out real dataset. (A minimal sketch of this meta-test protocol follows the table.)
Hardware Specification | No | The paper mentions 'GPU memory requirement' but does not provide specific hardware details such as exact GPU/CPU models, processor types, or memory amounts used for experiments.
Software Dependencies | No | The paper mentions 'Tensorflow Privacy [62]' but does not provide specific version numbers for any software dependencies, libraries, or solvers used in their experiments.
Experiment Setup | No | The paper states, 'We provide implementation details about data preprocessing, distilled data initialization, and hyperparameters in Appendix ??' and 'various ablation studies regarding the model pool, batch size, distilled data initialization, label learning, and model architectures in Appendix ??', explicitly deferring these specific experimental setup details to the appendix rather than providing them in the main text.
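The Dataset Splits entry quotes the paper's evaluation protocol: a model is trained from scratch on the distilled set S and then scored on a held-out real dataset T. The sketch below illustrates that meta-test loop only; the linear softmax classifier, function names, and hyperparameters are illustrative assumptions on our part, not the paper's setup (the paper trains convolutional networks on the learned synthetic images).

```python
import numpy as np

def train_on_distilled(x_s, y_s, num_classes, lr=0.1, steps=500, seed=0):
    # Fit a linear softmax classifier from scratch on the distilled set S.
    # Illustrative stand-in for Alg(theta, S); FRePo itself trains ConvNets.
    rng = np.random.default_rng(seed)
    w = rng.normal(scale=0.01, size=(x_s.shape[1], num_classes))
    y_onehot = np.eye(num_classes)[y_s]
    for _ in range(steps):
        logits = x_s @ w
        logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        grad = x_s.T @ (probs - y_onehot) / len(x_s)  # softmax cross-entropy gradient
        w -= lr * grad
    return w

def meta_test_accuracy(w, x_real, y_real):
    # Evaluate the model trained only on S against the held-out real dataset T.
    return float((np.argmax(x_real @ w, axis=1) == y_real).mean())

# Example usage (hypothetical arrays): x_distilled of shape (num_images, d) holds
# flattened synthetic images with labels y_distilled; x_test / y_test are the real
# held-out split.
#   w = train_on_distilled(x_distilled, y_distilled, num_classes=10)
#   print(meta_test_accuracy(w, x_test, y_test))
```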