Towards Lossless Dataset Distillation via Difficulty-Aligned Trajectory Matching
Authors: Ziyao Guo, Kai Wang, George Cazenavette, Hui Li, Kaipeng Zhang, Yang You
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We compare our method with several representative distillation methods including DC (Zhao et al., 2020), DM (Zhao & Bilen, 2023), DSA (Zhao & Bilen, 2021), CAFE (Wang et al., 2022), KIP (Nguyen et al., 2020), FrePo (Zhou et al., 2022), RCIG (Loo et al., 2023), MTT (Cazenavette et al., 2022), TESLA (Cui et al., 2023), and FTD (Du et al., 2023). The evaluations are performed on several popular datasets including CIFAR-10, CIFAR-100 (Krizhevsky et al., 2009), and Tiny ImageNet (Le & Yang, 2015). |
| Researcher Affiliation | Academia | Ziyao Guo (1,3,4), Kai Wang (1), George Cazenavette (2), Hui Li (4), Kaipeng Zhang (3), Yang You (1). Affiliations: 1 National University of Singapore; 2 Massachusetts Institute of Technology; 3 Shanghai Artificial Intelligence Laboratory; 4 Xidian University |
| Pseudocode | Yes | Algorithm 1: Pipeline of our method (a hedged sketch of the underlying trajectory-matching step appears after this table) |
| Open Source Code | Yes | Code and distilled datasets are available at https://github.com/NUS-HPC-AI-Lab/DATM. |
| Open Datasets | Yes | The evaluations are performed on several popular datasets including CIFAR-10, CIFAR-100 (Krizhevsky et al., 2009), and Tiny ImageNet (Le & Yang, 2015). |
| Dataset Splits | Yes | Specifically, we train a randomly initialized network on the distilled dataset and then evaluate its performance on the entire validation set of the original dataset. |
| Hardware Specification | Yes | Our experiments are run on 4 NVIDIA A100 GPUs, each with 80 GB of memory. |
| Software Dependencies | No | The paper does not provide specific software dependency names with version numbers (e.g., 'Python 3.8, PyTorch 1.9, and CUDA 11.1'). |
| Experiment Setup | Yes | We report the hyper-parameters of our method in Table 6. Additionally, for all the experiments with optimizing soft labels, we set its momentum to 0.9. (A minimal optimizer sketch appears after this table.) |
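The Pseudocode row quotes the title of Algorithm 1 but does not reproduce it. Below is a minimal, hypothetical PyTorch sketch of one MTT-style trajectory-matching step, the mechanism DATM's pipeline builds on; all names here (`forward`, `matching_step`, `expert_traj`) are illustrative assumptions, not the authors' released code.

```python
import torch
import torch.nn.functional as F

def forward(params, x):
    # Placeholder functional model: a single linear layer (w, b).
    w, b = params
    return x.flatten(1) @ w.t() + b

def matching_step(syn_images, soft_labels, expert_traj, start_epoch,
                  inner_steps=20, expert_span=2, inner_lr=0.01):
    # Expert checkpoints a few epochs apart. DATM aligns the sampled
    # start_epoch with the difficulty of the patterns being distilled
    # (early expert epochs = easy patterns, late epochs = hard patterns).
    theta_start = expert_traj[start_epoch]
    theta_target = expert_traj[start_epoch + expert_span]

    # Train a student from the expert's starting point on the synthetic
    # data, keeping the graph so the final loss can backpropagate into
    # the synthetic images and soft labels.
    student = [p.clone().requires_grad_(True) for p in theta_start]
    for _ in range(inner_steps):
        logits = forward(student, syn_images)
        loss = F.kl_div(F.log_softmax(logits, dim=1),
                        F.softmax(soft_labels, dim=1),
                        reduction="batchmean")
        grads = torch.autograd.grad(loss, student, create_graph=True)
        student = [p - inner_lr * g for p, g in zip(student, grads)]

    # Normalized distance between the student's and the expert's
    # endpoints (the matching objective of Cazenavette et al., 2022).
    num = sum(F.mse_loss(s, t, reduction="sum")
              for s, t in zip(student, theta_target))
    den = sum(F.mse_loss(a, t, reduction="sum")
              for a, t in zip(theta_start, theta_target))
    return num / (den + 1e-12)
```

A full run would wrap this step in an outer loop that backpropagates the returned loss into `syn_images` and `soft_labels` with separate optimizers.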
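For the soft-label momentum quoted in the Experiment Setup row, a minimal sketch of the corresponding optimizer setup, assuming the soft labels are a learnable tensor; the learning rate and tensor shape are illustrative placeholders, not values reported in the paper.

```python
import torch

# Momentum 0.9 is quoted from the paper; lr and shape are placeholders.
soft_labels = torch.randn(500, 10, requires_grad=True)  # e.g. 50 images per class, 10 classes
label_opt = torch.optim.SGD([soft_labels], lr=1.0, momentum=0.9)

# Typical update inside the outer loop:
#   loss = matching_step(syn_images, soft_labels, expert_traj, start_epoch)
#   label_opt.zero_grad(); loss.backward(); label_opt.step()
```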