Dataset Distillation with Infinitely Wide Convolutional Networks
Authors: Timothy Nguyen, Roman Novak, Lechao Xiao, Jaehoon Lee
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | For instance, using only 10 datapoints (0.02% of original dataset), we obtain over 65% test accuracy on CIFAR10 image classification task, a dramatic improvement over the previous best test accuracy of 40%. Our state-of-the-art results extend across many other settings for MNIST, Fashion-MNIST, CIFAR-10, CIFAR-100, and SVHN. |
| Researcher Affiliation | Industry | Timothy Nguyen (DeepMind); Roman Novak, Lechao Xiao, Jaehoon Lee (Google Research, Brain Team); timothycnguyen@deepmind.com, {romann, xlc, jaehlee}@google.com |
| Pseudocode | No | The paper describes algorithms (KIP, LS) and a client-server distributed workflow, but does not present structured pseudocode or clearly labeled algorithm blocks (a hedged illustrative sketch of the KIP objective appears after this table). |
| Open Source Code | Yes | We open source the distilled datasets, which used thousands of GPU hours, for the research community to further investigate at https://github.com/google-research/google-research/tree/master/kip. |
| Open Datasets | Yes | We apply the KIP and LS algorithms using the ConvNet architecture on the datasets MNIST [LeCun et al., 2010], Fashion MNIST [Xiao et al., 2017], SVHN [Netzer et al., 2011], CIFAR-10 [Krizhevsky, 2009], and CIFAR-100. |
| Dataset Splits | No | The experiments use standard datasets (MNIST, Fashion MNIST, SVHN, CIFAR-10, CIFAR-100) that come with predefined train/test splits, but the paper does not state the specific split percentages or counts used in its experiments, nor does it mention the use or size of a separate validation set. |
| Hardware Specification | No | The paper mentions a distributed meta-learning framework 'that draws upon hundreds of accelerators per training' and that 'thousands of GPU hours' were used, but it does not specify any particular hardware models (e.g., specific GPU or CPU models, or detailed cloud instance types) for their experiments. |
| Software Dependencies | No | The paper mentions 'jax.vjp in JAX [Bradbury et al., 2018]' and 'Courier available at https://github.com/deepmind/launchpad'. While JAX and Courier are named software components, specific version numbers for these or other libraries are not provided in the text. |
| Experiment Setup | Yes | We consider a variety of hyperparameter settings (image preprocessing method, whether to augment target data, and whether to train the support labels for KIP), the full details of which are described in Appendix A. |
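
Since the paper provides no pseudocode, the following is a minimal, hypothetical sketch of a single KIP-style step: kernel ridge regression with an infinite-width ConvNet NTK, whose loss on real target data is differentiated with respect to the distilled support images. It assumes the `neural_tangents` library for the NTK; the architecture, regularization constant, and function names below are illustrative placeholders, not the authors' released code.

```python
# Hypothetical sketch of a KIP-style objective (not the authors' implementation).
import jax
import jax.numpy as jnp
from neural_tangents import stax

# A small convolutional tower standing in for the paper's ConvNet architecture.
_, _, kernel_fn = stax.serial(
    stax.Conv(64, (3, 3), padding='SAME'), stax.Relu(),
    stax.Conv(64, (3, 3), padding='SAME'), stax.Relu(),
    stax.Flatten(), stax.Dense(10),
)

def kip_loss(x_support, y_support, x_target, y_target, reg=1e-6):
    """Kernel ridge regression fit on the support (distilled) set, evaluated on target data."""
    k_ss = kernel_fn(x_support, x_support, 'ntk')   # support-support NTK
    k_ts = kernel_fn(x_target, x_support, 'ntk')    # target-support NTK
    alpha = jnp.linalg.solve(k_ss + reg * jnp.eye(k_ss.shape[0]), y_support)
    preds = k_ts @ alpha                            # KRR predictions on target images
    return jnp.mean((preds - y_target) ** 2)

# The gradient w.r.t. the support images (and optionally labels) drives the distillation update.
grad_fn = jax.jit(jax.grad(kip_loss, argnums=0))
```

In the distributed setup described in the paper, kernel computations of this kind are spread across many accelerators per training run; the sketch above only illustrates the per-step objective, not that client-server workflow.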