Sequential Subset Matching for Dataset Distillation
Authors: Jiawei Du, Qin Shi, Joey Tianyi Zhou
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our proposed SeqMatch outperforms state-of-the-art methods in various datasets, including SVHN, CIFAR-10, CIFAR-100, and Tiny ImageNet. Our code is available at https://github.com/shqii1j/seqmatch. Experiments on diverse datasets demonstrate the effectiveness of SeqMatch, achieving state-of-the-art performance. |
| Researcher Affiliation | Academia | Jiawei Du, Qin Shi, Joey Tianyi Zhou; Centre for Frontier AI Research (CFAR), Agency for Science, Technology and Research (A*STAR), Singapore; Institute of High Performance Computing (IHPC), Agency for Science, Technology and Research (A*STAR), Singapore; {dujw,Joey_Zhou}@cfar.a-star.edu.sg, shiqin924924@gmail.com |
| Pseudocode | Yes | Algorithm 1 Training with SeqMatch in Distillation Phase. |
| Open Source Code | Yes | Our code is available at https://github.com/shqii1j/seqmatch. |
| Open Datasets | Yes | Datasets: We evaluate the performance of dataset distillation methods on several widely-used datasets across various resolutions. MNIST [28]... SVHN [36]... CIFAR10 and CIFAR100 [25]... Tiny ImageNet [27]... ImageNet [24] subsets... |
| Dataset Splits | No | The paper states 'The optimal value of hyperparameter K is obtained via grid searches within the set {2, 3, 4, 5, 6} in a validation set within the CIFAR-10 dataset.' This confirms the use of a validation set but does not provide specific details on how this split was created (e.g., percentages or sample counts for training, validation, and test sets). |
| Hardware Specification | Yes | We conduct our experiments on the server with four Tesla V100 GPUs. |
| Software Dependencies | No | The paper mentions using ConvNet and ResNet, but does not specify software dependencies like Python, PyTorch/TensorFlow, or CUDA versions. |
| Experiment Setup | Yes | To ensure the reproducibility of SeqMatch, we provide detailed implementation specifications. Our method relies on a single hyperparameter, denoted by K, which determines the number of subsets. In order to balance the inclusion of sufficient knowledge in each segment with the capture of high-level features in the later stages, we set K = {2, 3} for the scenarios where ipc = {10, 50}, respectively... Table 4: Hyperparameter values we used for SeqMatch-MTT in the main result table. The hyperparameters Max Start Epoch and Synthetic Step mostly vary across subsets; we use a sequence of numbers to denote the values used in the corresponding subsets. Img. is the abbreviation of ImageNet. |
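
The table notes that K (the number of subsets) was chosen by grid search over {2, 3, 4, 5, 6} on a CIFAR-10 validation set, but that the split itself is not described. The following is a minimal sketch of how such a search could be set up, not the authors' procedure. The names `holdout_split`, `partition_into_subsets`, and `grid_search_k`, the 10% validation fraction, the equal-size contiguous partition, and the `evaluate` callback are all assumptions introduced for illustration.

```python
import random
from typing import Callable, Dict, List, Sequence, Tuple

# Candidate values for K quoted in the paper's experiment setup.
K_GRID = (2, 3, 4, 5, 6)


def holdout_split(
    indices: Sequence[int], val_fraction: float = 0.1, seed: int = 0
) -> Tuple[List[int], List[int]]:
    """Hypothetical hold-out split (the paper does not state how its
    validation set was built). Returns (train_indices, val_indices)."""
    rng = random.Random(seed)
    shuffled = list(indices)
    rng.shuffle(shuffled)
    n_val = int(len(shuffled) * val_fraction)
    return shuffled[n_val:], shuffled[:n_val]


def partition_into_subsets(image_ids: Sequence[int], k: int) -> List[List[int]]:
    """Split synthetic image ids into k contiguous, near-equal subsets.
    Equal sizing is an assumption made for this sketch."""
    base, extra = divmod(len(image_ids), k)
    subsets: List[List[int]] = []
    start = 0
    for i in range(k):
        size = base + (1 if i < extra else 0)  # spread the remainder
        subsets.append(list(image_ids[start:start + size]))
        start += size
    return subsets


def grid_search_k(
    evaluate: Callable[[int], float], grid: Sequence[int] = K_GRID
) -> Tuple[int, Dict[int, float]]:
    """Return the K with the highest validation score and all scores.
    `evaluate(k)` is a user-supplied placeholder that should distill
    with k subsets and return validation accuracy."""
    scores = {k: evaluate(k) for k in grid}
    best_k = max(scores, key=scores.get)
    return best_k, scores
```

In use, `evaluate` would wrap a full distill-then-train run for a given K and score it on the held-out indices; the sketch only fixes the bookkeeping around that call.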