
A Label is Worth A Thousand Images in Dataset Distillation

Authors: Tian Qin, Zhiwei Deng, David Alvarez-Melis

NeurIPS 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Through a series of ablation experiments, we study the role of soft labels in depth. Our results reveal that the main factor explaining the performance of state-of-the-art distillation methods is not the specific techniques used to generate synthetic data but rather the use of soft labels.
Researcher Affiliation	Collaboration	Tian Qin, Harvard University, Cambridge, MA, tqin@g.harvard.edu; Zhiwei Deng, Google DeepMind, Mountain View, CA, zhiweideng@google.com; David Alvarez-Melis, Harvard University & MSR, Cambridge, MA, dam@seas.harvard.edu
Pseudocode	Yes	Algorithm 1 Learn soft label with BPTT
Open Source Code	Yes	Code for all experiments is available at https://github.com/sunnytqin/no-distillation.
Open Datasets	Yes	Table 1: Benchmark SOTA methods against CutMix baseline and soft label baseline on ImageNet-1K. Table 2: Benchmark SOTA methods against soft label baseline ("SL baseline") on Tiny ImageNet, CIFAR-100 and CIFAR-10.
Dataset Splits	No	The paper does not explicitly specify validation dataset splits or how they were derived from the training data, although it does discuss expert training and hyperparameter tuning, which often imply that a validation set is used. It references a 'standard training recipe' but does not detail the splits.
Hardware Specification	Yes	All experiments are conducted on NVIDIA A100 SXM4 40GB or NVIDIA H100 80GB HBM3.
Software Dependencies	No	The paper mentions 'PyTorch' and cites a paper [21] from 2019 about it, implying a version context from that year. However, it does not provide a specific version number (e.g., 'PyTorch 1.9') for the software dependency.
Experiment Setup	Yes	We follow a standard training recipe to train experts on downsized ImageNet-1K, Tiny ImageNet, CIFAR-10, and CIFAR-100. This standard training recipe involves an SGD optimizer and a simple step learning rate schedule... Table 7: Hyperparameter list to reproduce soft label baseline results in Table 1 and Table 2.
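
The Experiment Setup row mentions a "simple step learning rate schedule" without giving its values here. As a rough illustration of what such a schedule computes, the sketch below multiplies the learning rate by a fixed factor every fixed number of epochs, which is the same closed form that PyTorch's `torch.optim.lr_scheduler.StepLR` follows. The function name `step_lr` and all numeric values are assumptions chosen for illustration; they are not the paper's Table 7 settings.

```python
def step_lr(base_lr: float, epoch: int, step_size: int, gamma: float) -> float:
    """Learning rate at `epoch` under a step schedule.

    The rate starts at `base_lr` and is multiplied by `gamma`
    once every `step_size` epochs.
    """
    if step_size <= 0:
        raise ValueError("step_size must be positive")
    return base_lr * gamma ** (epoch // step_size)


# Hypothetical values for illustration only (not taken from the paper).
BASE_LR = 0.1
STEP_SIZE = 30
GAMMA = 0.1
EPOCHS = 90

# One learning rate per epoch; the value only changes at epochs 30 and 60.
schedule = [step_lr(BASE_LR, e, STEP_SIZE, GAMMA) for e in range(EPOCHS)]
```

With these assumed values the rate stays constant within each 30-epoch block and drops by a factor of ten at each block boundary. Reproducing the paper's results would require the actual hyperparameters from its Table 7 and the linked repository.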