Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
MGD$^3$: Mode-Guided Dataset Distillation using Diffusion Models
Authors: Jeffrey A. Chan-Santiago, Praveen Tirupattur, Gaurav Kumar Nayak, Gaowen Liu, Mubarak Shah
ICML 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our approach outperforms state-of-the-art methods, achieving accuracy gains of 4.4%, 2.9%, 1.6%, and 1.6% on ImageNette, ImageIDC, ImageNet-100, and ImageNet-1K, respectively. Our method eliminates the need for fine-tuning diffusion models with distillation losses, significantly reducing computational costs. Our code is available on the project webpage: https://jachansantiago.github.io/modeguided-distillation/ |
| Researcher Affiliation | Collaboration | $^1$Center for Research in Computer Vision, University of Central Florida, Orlando, Florida, United States; $^2$Mehta Family School of Data Science and Artificial Intelligence, Indian Institute of Technology Roorkee, Roorkee, Uttarakhand, India; $^3$Cisco Research, San Jose, California, United States. Correspondence to: Jeffrey A. Chan-Santiago <EMAIL>. |
| Pseudocode | Yes | Algorithm 1: Mode Guidance with DDIM sampling, given a diffusion model $\epsilon_\theta(x_t)$, an estimated mode $m_k$, and a mode-guidance scale $\lambda$. |
| Open Source Code | Yes | Our code is available on the project webpage: https://jachansantiago.github.io/modeguided-distillation/ |
| Open Datasets | Yes | The datasets we evaluate include ImageNet-1K, ImageNet-100, ImageIDC, ImageNette, and ImageNet-A to ImageNet-E. Additionally, we include results on ImageWoof in Appendix E. |
| Dataset Splits | No | The hard-label protocol generates a dataset with its corresponding class labels, trains a network from scratch, and evaluates the network on the original test set. This process is repeated three times for target architectures, and the accuracy mean and standard deviation are reported. Random resize-crop and CutMix are applied as augmentation techniques during the target network's training. For more detailed technical information about the protocol, please refer to Gu et al. (2024). |
| Hardware Specification | Yes | We use a single NVIDIA RTX A5000 GPU with 24GB VRAM to run our experiments. |
| Software Dependencies | No | No specific software dependencies with version numbers are explicitly provided in the paper. |
| Experiment Setup | Yes | Implementation details. Our pre-trained model G is DiT-XL/2 trained on ImageNet, and the image size is 256 × 256. We use the sampling strategy described in Peebles & Xie (2023), with 50 sampling steps and classifier-free guidance at a guidance scale of 4.0. For Mode Guidance, we set $\lambda$ to 0.1, and in our experiments we use stop guidance $t_{stop} = 25$. We use K-means to perform mode discovery, setting k = IPC. For the hard-label protocol... We train our model on a synthetic dataset for 1500 epochs for IPC values of 20, 50, and 100, and extend training to 2000 epochs for an IPC value of 10. We use Stochastic Gradient Descent (SGD) as the optimizer with a learning rate of 0.01, and a learning-rate decay scheduler at the 2/3 and 5/6 points of training with the decay factor (gamma) set to 0.2. Cross-entropy is used as the loss objective. For the soft-label protocol... We train a network for 300 epochs with a ResNet-18 architecture as both teacher and student, using the AdamW optimizer with a learning rate of 0.001, a weight decay of 0.01, and parameters $\beta_1 = 0.9$ and $\beta_2 = 0.999$. |
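The Pseudocode row refers to Algorithm 1, Mode Guidance with DDIM sampling. The paper's exact algorithm is not reproduced here; the following is a minimal, illustrative NumPy sketch only, assuming the guidance term nudges the DDIM clean-sample estimate toward the mode $m_k$ with scale $\lambda$ until step $t_{stop}$ (the guidance form, the `eps_theta` stub, and the alpha-bar schedule are assumptions, not the paper's implementation):

```python
import numpy as np

def ddim_mode_guided_sample(eps_theta, m_k, lam=0.1, steps=50, t_stop=25, seed=0):
    """Toy DDIM sampling loop with a mode-guidance term (cf. Algorithm 1).

    eps_theta: callable (x_t, step) -> predicted noise (stand-in for the model).
    m_k:       estimated mode for the current cluster (guidance target).
    lam:       mode-guidance scale (the paper reports 0.1).
    t_stop:    step after which guidance is disabled (the paper uses 25 of 50).
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(m_k.shape)
    # Toy alpha-bar schedule decreasing from ~1 toward 0 (illustrative only).
    alpha_bar = np.linspace(0.999, 0.01, steps)
    for i in range(steps):
        ab_t = alpha_bar[i]
        eps = eps_theta(x, i)
        # Standard DDIM estimate of the clean sample x0.
        x0 = (x - np.sqrt(1.0 - ab_t) * eps) / np.sqrt(ab_t)
        if i < t_stop:
            # Assumed guidance form: pull the x0-estimate toward the mode m_k.
            x0 = x0 + lam * (m_k - x0)
        ab_next = alpha_bar[i + 1] if i + 1 < steps else 1.0
        # Deterministic DDIM update (eta = 0).
        x = np.sqrt(ab_next) * x0 + np.sqrt(1.0 - ab_next) * eps
    return x
```

With $\lambda = 1$ and guidance active for all steps, the sketch collapses exactly onto $m_k$; with the paper's $\lambda = 0.1$ and $t_{stop} = 25$, guidance only biases the first half of the trajectory, which matches the stated intent of steering samples toward a mode without fine-tuning the diffusion model.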
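The Experiment Setup row states that mode discovery uses K-means with k = IPC, i.e. one estimated mode per distilled image of a class. As a self-contained sketch of that step (the function name `discover_modes` and the plain-NumPy K-means are illustrative, not from the paper's code, which presumably clusters features of the real class samples):

```python
import numpy as np

def discover_modes(features, ipc, iters=20, seed=0):
    """Mode discovery via K-means, with k = IPC (one mode per distilled image).

    features: (N, D) array of per-class sample features.
    ipc:      images per class, used as the number of clusters k.
    """
    rng = np.random.default_rng(seed)
    # Initialize centroids from randomly chosen feature vectors.
    centers = features[rng.choice(len(features), size=ipc, replace=False)]
    for _ in range(iters):
        # Assign every feature vector to its nearest centroid.
        dists = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=-1)
        labels = dists.argmin(axis=1)
        # Recompute centroids; an empty cluster keeps its previous centroid.
        for j in range(ipc):
            mask = labels == j
            if mask.any():
                centers[j] = features[mask].mean(axis=0)
    return centers  # one estimated mode m_k per distilled image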