Are Large-scale Soft Labels Necessary for Large-scale Dataset Distillation?

Authors: Lingao Xiao, Yang He

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments validate our discoveries. For example, when condensing ImageNet-1K to 200 images per class, our approach compresses the required soft labels from 113 GB to 2.8 GB (40× compression) with a 2.6% performance gain. Code is available at: https://github.com/he-y/soft-label-pruning-for-dataset-distillation. (A worked check of this compression ratio follows the table.)
Researcher Affiliation | Collaboration | (1) CFAR, Agency for Science, Technology and Research, Singapore; (2) IHPC, Agency for Science, Technology and Research, Singapore; (3) National University of Singapore
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. Procedures are described in text or illustrated with diagrams (e.g., Figure 4), but not in pseudocode format.
Open Source Code | Yes | Code is available at: https://github.com/he-y/soft-label-pruning-for-dataset-distillation.
Open Datasets | Yes | Dataset. Our experiment results are evaluated on Tiny-ImageNet [33], ImageNet-1K [34], and ImageNet-21K-P [35]. [...] Tiny-ImageNet [33] is a subset of ImageNet-1K containing 500 images per class for a total of 200 classes, and images are downsampled to 64×64. ImageNet-1K [34] contains 1,000 classes and 1,281,167 images in total. Images are resized to 224×224. ImageNet-21K-P [35] is the pruned version of ImageNet-21K, containing 10,450 classes and 11,060,223 images in total. Images are sized to 224×224 resolution. (A minimal preprocessing sketch follows the table.)
Dataset Splits | No | The paper mentions using well-known datasets and adhering to previous preprocessing/validation settings (e.g., "For validation, we adhere to the hyperparameter settings of CDA [7]"), but it does not explicitly provide specific training/validation/test splits (e.g., percentages or counts) in its text.
Hardware Specification | Yes | Experiments are performed on 4 A100 80G GPU cards.
Software Dependencies | No | The paper mentions using a "PyTorch pretrained ResNet-18" and a "Timm pretrained model" but does not specify version numbers for PyTorch, timm, or any other critical software dependencies.
Experiment Setup | Yes | Appendix C provides detailed hyperparameter settings in tables for different phases and datasets, such as Table 11, "Data Synthesis of ImageNet-1K", which lists "Iteration 4,000", "Optimizer Adam", "Image LR 0.25", "Batch Size IPC-dependent", and "BN Loss (α) 0.01". (A minimal configuration sketch follows the table.)
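
As a quick arithmetic check of the compression factor quoted in the Research Type row, dividing the uncompressed soft-label size by the compressed size reproduces the reported figure of roughly 40×:

$$\frac{113\,\text{GB}}{2.8\,\text{GB}} \approx 40.4$$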
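
The image sizes quoted in the Open Datasets row (64×64 for Tiny-ImageNet, 224×224 for ImageNet-1K and ImageNet-21K-P) map onto standard torchvision resize transforms. The sketch below is one plausible setup, not the paper's exact pipeline; the normalization statistics and the absence of augmentation are assumptions.

```python
from torchvision import transforms

# Standard ImageNet normalization statistics (an assumption, not taken from the paper).
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)

# Tiny-ImageNet: images downsampled to 64x64.
tiny_imagenet_transform = transforms.Compose([
    transforms.Resize((64, 64)),
    transforms.ToTensor(),
    transforms.Normalize(IMAGENET_MEAN, IMAGENET_STD),
])

# ImageNet-1K / ImageNet-21K-P: images resized to 224x224.
imagenet_transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(IMAGENET_MEAN, IMAGENET_STD),
])
```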
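
The Experiment Setup row quotes the headline hyperparameters from Table 11 (Iteration 4,000, Optimizer Adam, Image LR 0.25, IPC-dependent batch size, BN loss weight α = 0.01). The following is a minimal sketch of how such a data-synthesis loop could be wired up in PyTorch; the loss terms (`task_loss`, `bn_statistics_loss`) are placeholder stand-ins, and the batch size of 200 is an illustrative choice, not the authors' implementation.

```python
import torch

# Hyperparameters quoted from Table 11 ("Data Synthesis of ImageNet-1K").
NUM_ITERATIONS = 4_000
IMAGE_LR = 0.25
BN_LOSS_WEIGHT = 0.01   # alpha in the table
BATCH_SIZE = 200        # "IPC-dependent"; 200 IPC chosen here for illustration


def task_loss(images: torch.Tensor) -> torch.Tensor:
    """Placeholder for the synthesis objective evaluated on a pretrained
    teacher (e.g., cross-entropy against target classes). Stand-in only."""
    return images.pow(2).mean()


def bn_statistics_loss(images: torch.Tensor) -> torch.Tensor:
    """Placeholder for the BatchNorm statistics-matching term. Stand-in only."""
    return images.mean().abs()


# The synthetic images themselves are the optimization variables
# (assumed 224x224 RGB, matching the quoted ImageNet-1K resolution).
synthetic_images = torch.randn(BATCH_SIZE, 3, 224, 224, requires_grad=True)
optimizer = torch.optim.Adam([synthetic_images], lr=IMAGE_LR)

for step in range(NUM_ITERATIONS):
    optimizer.zero_grad()
    loss = task_loss(synthetic_images) + BN_LOSS_WEIGHT * bn_statistics_loss(synthetic_images)
    loss.backward()
    optimizer.step()
```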