Large Scale Dataset Distillation with Domain Shift

Authors: Noel Loo, Alaa Maalouf, Ramin Hasani, Mathias Lechner, Alexander Amini, Daniela Rus

ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We achieve state-of-the-art results on Tiny-ImageNet, ImageNet-1K, and ImageNet-21K over a variety of recently proposed baselines, including high cross-architecture generalization. Additionally, our ablation studies provide lessons on the importance of validation-time hyperparameters on distillation performance, motivating the need for standardization.
Researcher Affiliation | Collaboration | 1) MIT CSAIL, 2) Liquid AI. Correspondence to: Noel Loo <loo@mit.edu>.
Pseudocode | Yes | The pseudocode for the image synthesis step is provided in Algorithm 1. Pseudocode for the image labeling step, and more detailed pseudocode for image synthesis, are available in Appendix A. (Referring to Algorithm 1, Algorithm 2, Algorithm 3, Algorithm 4; see the synthesis sketch below the table.)
Open Source Code | Yes | Code available at https://github.com/yolky/d3s_distillation
Open Datasets | Yes | We consider three tasks, ranging from smaller scale to larger scale. Firstly, we have Tiny-ImageNet (Le & Yang, 2015)... As our medium-scale dataset, we consider ImageNet-1K (Deng et al., 2009)... Finally, we have our most challenging task, ImageNet-21K (Ridnik et al., 2021)... (See the data-loading sketch below the table.)
Dataset Splits | Yes | Table 7 (Details of Datasets) reports validation set sizes of 10,000 for Tiny-ImageNet (Le & Yang, 2015), 50,000 for ImageNet-1K (Deng et al., 2009), and 522,500 for ImageNet-21K (Ridnik et al., 2021).
Hardware Specification | Yes | All experiments were run on either single RTX 4090s with 24GB VRAM or single RTX 6000 Adas with 48GB VRAM.
Software Dependencies | No | The paper mentions using the PyTorch library but does not provide specific version numbers for PyTorch or any other software dependencies required to reproduce the experiments. (See the environment-capture sketch below the table.)
Experiment Setup | Yes | We provide hyperparameters for image synthesis in Table 9. For the augmentation, we use the recently proposed curriculum augmentation of Yin & Shen (2023)... Table 8 lists the hyperparameters used for training source models, and Table 10 the hyperparameters used for validation. (See the curriculum-augmentation sketch below the table.)
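
Pseudocode row: the paper's actual synthesis procedure is given in its Algorithm 1 and Appendix A. As a rough illustration of the family of methods it belongs to, the sketch below shows a minimal, hypothetical SRe2L/DeepInversion-style synthesis loop that optimizes synthetic pixels against a frozen pretrained teacher using cross-entropy plus a BatchNorm-statistics matching term. It is not the paper's Algorithm 1; the function name `synthesize_batch` and all hyperparameter values are assumptions.

```python
# Hypothetical sketch of an SRe2L/DeepInversion-style image-synthesis loop.
# This is NOT the paper's Algorithm 1; it only illustrates the general idea of
# optimizing synthetic pixels against a frozen, pretrained teacher network.
import torch
import torch.nn.functional as F
import torchvision

def synthesize_batch(num_images=16, num_classes=1000, steps=1000, lr=0.25, bn_weight=0.01):
    teacher = torchvision.models.resnet18(weights="IMAGENET1K_V1").eval()
    for p in teacher.parameters():
        p.requires_grad_(False)

    # Hook every BatchNorm layer to compare batch statistics with running statistics.
    bn_losses = []
    def bn_hook(module, inputs, _output):
        x = inputs[0]
        mean = x.mean(dim=(0, 2, 3))
        var = x.var(dim=(0, 2, 3), unbiased=False)
        bn_losses.append(F.mse_loss(mean, module.running_mean) +
                         F.mse_loss(var, module.running_var))
    hooks = [m.register_forward_hook(bn_hook)
             for m in teacher.modules() if isinstance(m, torch.nn.BatchNorm2d)]

    # Synthetic images start from noise and are optimized directly in pixel space.
    images = torch.randn(num_images, 3, 224, 224, requires_grad=True)
    labels = torch.randint(0, num_classes, (num_images,))
    opt = torch.optim.Adam([images], lr=lr)

    for _ in range(steps):
        bn_losses.clear()
        opt.zero_grad()
        logits = teacher(images)
        loss = F.cross_entropy(logits, labels) + bn_weight * sum(bn_losses)
        loss.backward()
        opt.step()

    for h in hooks:
        h.remove()
    return images.detach(), labels
```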
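Open Datasets and Dataset Splits rows: a minimal data-loading sketch, assuming the three validation splits have been downloaded into ImageFolder-style directories. The directory paths, resolutions, and transform choices are placeholders, not something the paper specifies.

```python
# Hypothetical data-loading sketch; directory paths are placeholders and the
# paper does not prescribe this loading code.
import torchvision.transforms as T
from torchvision.datasets import ImageFolder
from torch.utils.data import DataLoader

def make_val_loader(root, resolution, batch_size=256):
    transform = T.Compose([
        T.Resize(int(resolution * 1.14)),
        T.CenterCrop(resolution),
        T.ToTensor(),
        T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    ])
    return DataLoader(ImageFolder(root, transform=transform),
                      batch_size=batch_size, num_workers=8)

# Validation set sizes reported in the paper's Table 7:
#   Tiny-ImageNet: 10,000   ImageNet-1K: 50,000   ImageNet-21K: 522,500
loaders = {
    "tiny-imagenet": make_val_loader("/data/tiny-imagenet-200/val", 64),
    "imagenet-1k":   make_val_loader("/data/imagenet/val", 224),
    "imagenet-21k":  make_val_loader("/data/imagenet21k/val", 224),
}
```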
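Software Dependencies row: since the paper does not pin versions, a reproduction run would need to record its own environment. A minimal sketch using PyTorch's built-in version attributes:

```python
# Minimal environment-capture sketch; records the versions that the paper
# itself does not pin, so a reproduction run is at least self-documenting.
import json
import torch
import torchvision

env = {
    "torch": torch.__version__,
    "torchvision": torchvision.__version__,
    "cuda": torch.version.cuda,
    "cudnn": torch.backends.cudnn.version(),
    "gpu": torch.cuda.get_device_name(0) if torch.cuda.is_available() else None,
}
with open("environment.json", "w") as f:
    json.dump(env, f, indent=2)
print(env)
```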
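Experiment Setup row: the quoted curriculum augmentation of Yin & Shen (2023) varies the random-crop scale over the course of image synthesis. The sketch below assumes a linear annealing of RandomResizedCrop's minimum scale from easy (large crops) to hard (small crops); the actual schedule and values are those of the paper's Table 9 and may differ.

```python
# Hypothetical curriculum-cropping sketch: the minimum crop scale is annealed
# over the synthesis iterations. The actual schedule in Yin & Shen (2023) and
# in the paper's Table 9 may differ; treat the numbers here as placeholders.
import torchvision.transforms as T

def curriculum_crop(step, total_steps, size=224, start_min_scale=0.5, end_min_scale=0.08):
    """Return a RandomResizedCrop whose minimum scale shrinks as synthesis progresses,
    so early iterations see large, 'easy' crops and later ones see smaller, 'harder' crops."""
    progress = step / max(total_steps - 1, 1)
    min_scale = start_min_scale + progress * (end_min_scale - start_min_scale)
    return T.RandomResizedCrop(size, scale=(min_scale, 1.0))

# Example: the crop range widens from (0.5, 1.0) at step 0 to (0.08, 1.0) at the last step.
for step in (0, 500, 999):
    print(step, curriculum_crop(step, 1000).scale)
```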