Are Large-scale Soft Labels Necessary for Large-scale Dataset Distillation?
Authors: Lingao Xiao, Yang He
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments validate our discoveries. For example, when condensing ImageNet-1K to 200 images per class, our approach compresses the required soft labels from 113 GB to 2.8 GB (40× compression) with a 2.6% performance gain. Code is available at: https://github.com/he-y/soft-label-pruning-for-dataset-distillation. |
| Researcher Affiliation | Collaboration | 1. CFAR, Agency for Science, Technology and Research, Singapore; 2. IHPC, Agency for Science, Technology and Research, Singapore; 3. National University of Singapore |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. Procedures are described in text or illustrated with diagrams (e.g., Figure 4), but not in a pseudocode format. |
| Open Source Code | Yes | Code is available at: https://github.com/he-y/soft-label-pruning-for-dataset-distillation. |
| Open Datasets | Yes | Dataset. Our experiment results are evaluated on Tiny-ImageNet [33], ImageNet-1K [34], and ImageNet-21K-P [35]. [...] Tiny-ImageNet [33] is the subset of ImageNet-1K containing 500 images per class of a total of 200 classes, and spatial sizes of images are downsampled to 64×64. ImageNet-1K [34] contains 1,000 classes and 1,281,167 images in total. The image sizes are resized to 224×224. ImageNet-21K-P [35] is the pruned version of ImageNet-21K, containing 10,450 classes and 11,060,223 images in total. Images are sized to 224×224 resolution. |
| Dataset Splits | No | The paper mentions using well-known datasets and adhering to previous preprocessing/validation settings (e.g., "For validation, we adhere to the hyperparameter settings of CDA [7]"), but it does not explicitly provide specific training/validation/test dataset splits (e.g., percentages or counts) within its text. |
| Hardware Specification | Yes | Experiments are performed on 4 A100 80G GPU cards. |
| Software Dependencies | No | The paper mentions using "PyTorch pretrained ResNet-18" and "Timm pretrained model" but does not specify version numbers for PyTorch, Timm, or any other critical software dependencies. |
| Experiment Setup | Yes | Appendix C provides detailed hyperparameter settings in tables for different phases and datasets, such as Table 11 "Data Synthesis of ImageNet-1K" which lists "Iteration 4,000", "Optimizer Adam", "Image LR 0.25", "Batch Size IPC-dependent", and "BN Loss (α) 0.01". |
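
The Research Type row cites a reduction of soft-label storage from 113 GB to 2.8 GB; a minimal arithmetic sketch of the implied ratio, taking only the two reported sizes from the paper:

```python
# Sanity check of the reported soft-label compression for ImageNet-1K
# at 200 images per class; both sizes are quoted from the paper.
original_gb = 113.0  # full soft-label storage
pruned_gb = 2.8      # pruned soft-label storage

ratio = original_gb / pruned_gb
print(f"compression ratio ~ {ratio:.1f}x")  # ~40.4x, consistent with the reported 40x
```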
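
The Open Datasets row lists 64×64 inputs for Tiny-ImageNet and 224×224 for ImageNet-1K and ImageNet-21K-P. Below is a hedged torchvision-style preprocessing sketch that only encodes those resolutions; the paper's actual augmentation and normalization pipeline may differ.

```python
from torchvision import transforms

# Illustrative resize-only pipelines for the resolutions quoted above;
# the paper's real pipeline (augmentation, normalization) is not specified here.
tiny_imagenet_tf = transforms.Compose([
    transforms.Resize((64, 64)),    # Tiny-ImageNet
    transforms.ToTensor(),
])
imagenet_tf = transforms.Compose([
    transforms.Resize((224, 224)),  # ImageNet-1K and ImageNet-21K-P
    transforms.ToTensor(),
])
```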
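
The Experiment Setup row quotes several values from Table 11 ("Data Synthesis of ImageNet-1K"); the sketch below restates them as a plain configuration dictionary. The key names and the `make_image_optimizer` helper are illustrative assumptions rather than the repository's actual API; only the numeric values come from the paper.

```python
import torch

# Hypothetical config mirroring the quoted Table 11 entries; key names are illustrative.
synthesis_cfg = {
    "iterations": 4_000,    # "Iteration 4,000"
    "optimizer": "adam",    # "Optimizer Adam"
    "image_lr": 0.25,       # "Image LR 0.25"
    "batch_size": None,     # "Batch Size IPC-dependent" (chosen per images-per-class)
    "bn_loss_alpha": 0.01,  # "BN Loss (α) 0.01"
}

def make_image_optimizer(synthetic_images: torch.Tensor) -> torch.optim.Optimizer:
    """Illustrative helper: Adam over the synthetic images at the reported learning rate."""
    synthetic_images.requires_grad_(True)
    return torch.optim.Adam([synthetic_images], lr=synthesis_cfg["image_lr"])
```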