Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Beyond Random: Automatic Inner-loop Optimization in Dataset Distillation
Authors: Muquan Li, Hang Gou, Dongyang Zhang, Shuang Liang, Xiurui Xie, Deqiang Ouyang, Ke Qin
NeurIPS 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on CIFAR-10, CIFAR-100, Tiny-ImageNet, and ImageNet-1K show that AT-BPTT achieves state-of-the-art performance, improving accuracy by an average of 6.16% over baseline methods. |
| Researcher Affiliation | Academia | ¹University of Electronic Science and Technology of China, ²Chongqing University |
| Pseudocode | Yes | Algorithm 1: Automatic Truncated Backpropagation Through Time |
| Open Source Code | No | We have also ensured reproducibility by providing full methodological details and plan to release code under an open-source license upon publication. |
| Open Datasets | Yes | We select three standard datasets: CIFAR-10 [15] (10 classes, 32 × 32), CIFAR-100 [15] (100 classes, 32 × 32), and Tiny-ImageNet [16] (200 classes, 64 × 64). To further show the effectiveness of AT-BPTT on high-resolution images, we scale up the dataset to ImageNet-1K [29] (1,000 classes, 224 × 224). |
| Dataset Splits | Yes | We adhere to the conventional procedure adopted in dataset distillation [9]. We select three standard datasets: CIFAR-10 [15] (10 classes, 32 × 32), CIFAR-100 [15] (100 classes, 32 × 32), and Tiny-ImageNet [16] (200 classes, 64 × 64). To further show the effectiveness of AT-BPTT on high-resolution images, we scale up the dataset to ImageNet-1K [29] (1,000 classes, 224 × 224). For CIFAR-10 and CIFAR-100, we distill datasets with 1, 10, and 50 images per class (IPC = 1, 10, 50), while for Tiny-ImageNet and ImageNet-1K, we use 1 and 10 images per class (IPC = 1, 10). |
| Hardware Specification | Yes | All experiments are conducted on NVIDIA A800 GPUs with IPC setting of 1, 10, and 50. |
| Software Dependencies | No | The paper mentions 'standardized convolutional neural network [11] (CNN) architectures' and the use of 'Higher framework' for meta-gradient computation and 'Adam optimizer' but does not specify version numbers for any software dependencies. |
| Experiment Setup | Yes | Hyperparameter Settings. In the experiments, several hyperparameters require tuning, as they directly influence the distillation performance of the method. Accordingly, we detail the hyperparameter settings used for performance evaluation in the main text as presented in the Tab. 4. Specifically, window denotes the initial window size, totwindow represents the total number of unrolled time steps, d controls the range of the truncation window size, lr indicates the learning rate, architecture refers to the adopted network architecture, batch size determines the number of training samples used in a single parameter update, and epoch specifies the number of iterations. Table 4: Description of Hyperparameters in Experiments. |
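
The hyperparameters named in the Experiment Setup row could be collected in a small configuration object when attempting a reproduction. The following is a minimal sketch only: the class name `DistillConfig`, the field types, the sanity checks, and every example value are assumptions made for illustration, since the actual settings are given in the paper's Tab. 4 and are not reproduced here.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class DistillConfig:
    """Hypothetical container for the hyperparameters named in Tab. 4."""

    window: int        # initial truncation window size
    totwindow: int     # total number of unrolled time steps
    d: int             # controls the range of the window size (type assumed)
    lr: float          # learning rate
    architecture: str  # adopted network architecture
    batch_size: int    # training samples used in a single parameter update
    epoch: int         # number of iterations

    def __post_init__(self) -> None:
        # Assumed sanity checks; the source does not state these constraints.
        if not 0 < self.window <= self.totwindow:
            raise ValueError("window must lie in (0, totwindow]")
        if self.lr <= 0:
            raise ValueError("lr must be positive")


# Placeholder values for illustration only, NOT the paper's settings.
example = DistillConfig(
    window=10,
    totwindow=100,
    d=5,
    lr=0.01,
    architecture="ConvNet",
    batch_size=256,
    epoch=1000,
)
print(asdict(example))
```

Keeping the fields required (no defaults) forces a reproduction script to state every value explicitly, which makes it easier to compare a run against the settings reported in the paper.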