
Sequential Subset Matching for Dataset Distillation

Authors: Jiawei Du, Qin Shi, Joey Tianyi Zhou

NeurIPS 2023

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Our proposed SeqMatch outperforms state-of-the-art methods in various datasets, including SVHN, CIFAR-10, CIFAR-100, and Tiny ImageNet. Our code is available at https://github.com/shqii1j/seqmatch. Experiments on diverse datasets demonstrate the effectiveness of SeqMatch, achieving state-of-the-art performance.
Researcher Affiliation	Academia	Jiawei Du, Qin Shi, Joey Tianyi Zhou; Centre for Frontier AI Research (CFAR), Agency for Science, Technology and Research (A*STAR), Singapore; Institute of High Performance Computing (IHPC), Agency for Science, Technology and Research (A*STAR), Singapore; {dujw,Joey_Zhou}@cfar.a-star.edu.sg, shiqin924924@gmail.com
Pseudocode	Yes	Algorithm 1 Training with SeqMatch in Distillation Phase.
Open Source Code	Yes	Our code is available at https://github.com/shqii1j/seqmatch.
Open Datasets	Yes	Datasets: We evaluate the performance of dataset distillation methods on several widely-used datasets across various resolutions. MNIST [28]... SVHN [36]... CIFAR10 and CIFAR100 [25]... Tiny ImageNet [27]... ImageNet [24] subsets...
Dataset Splits	No	The paper states 'The optimal value of hyperparameter K is obtained via grid searches within the set {2, 3, 4, 5, 6} in a validation set within the CIFAR-10 dataset.' This confirms the use of a validation set but does not provide specific details on how this split was created (e.g., percentages or sample counts for training, validation, and test sets).
Hardware Specification	Yes	We conduct our experiments on the server with four Tesla V100 GPUs.
Software Dependencies	No	The paper mentions using ConvNet and ResNet, but does not specify software dependencies such as Python, PyTorch/TensorFlow, or CUDA versions.
Experiment Setup	Yes	To ensure the reproducibility of SeqMatch, we provide detailed implementation specifications. Our method relies on a single hyperparameter, denoted by K, which determines the number of subsets. In order to balance the inclusion of sufficient knowledge in each segment with the capture of high-level features in the later stages, we set K = {2, 3} for the scenarios where ipc = {10, 50}, respectively... Table 4: Hyperparameter values we used for SeqMatch-MTT in the main result table. Most values of the hyperparameters Max Start Epoch and Synthetic Step vary across subsets; we use a sequence of numbers to denote the values used in the corresponding subsets. Img. abbreviates ImageNet.
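
The Dataset Splits row notes that K was chosen by grid search over {2, 3, 4, 5, 6} on a CIFAR-10 validation set, but that the paper does not say how that set was built. The Python sketch below shows one way a reader might fill the gap when reproducing the search. It is not the authors' procedure: the 10% validation fraction, the seed, and the names split_indices, grid_search_k and evaluate are all assumptions made for illustration. Only the grid of K values and the 50,000-image size of the CIFAR-10 training set are taken as given.

```python
import random

# Grid of K values reported in the paper for the CIFAR-10 search.
K_GRID = [2, 3, 4, 5, 6]


def split_indices(n_samples, val_fraction=0.1, seed=0):
    """Shuffle indices 0..n_samples-1 with a fixed seed and hold out a validation slice.

    val_fraction and seed are assumptions; the paper does not report them.
    Returns (train_indices, val_indices).
    """
    if not 0.0 < val_fraction < 1.0:
        raise ValueError("val_fraction must lie strictly between 0 and 1")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # local RNG, so global state is untouched
    n_val = int(round(n_samples * val_fraction))
    return indices[n_val:], indices[:n_val]


def grid_search_k(evaluate, k_grid=K_GRID):
    """Return the K from k_grid with the highest validation score.

    evaluate is a hypothetical user-supplied callable mapping K to a
    validation accuracy (e.g. distill with K subsets, train, then score).
    """
    return max(k_grid, key=evaluate)


# CIFAR-10 ships 50,000 training images; hold out an assumed 10% for validation.
train_idx, val_idx = split_indices(50_000, val_fraction=0.1, seed=0)
print(len(train_idx), len(val_idx))  # 45000 5000
```

Recording the seed and fraction alongside the results makes the split recoverable later, which is the detail the Dataset Splits row found missing.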