Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.

Hyperbolic Dataset Distillation

Authors: Wenyuan Li, Guang Li, Keisuke Maeda, Takahiro Ogawa, Miki Haseyama

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experiments on diverse benchmarks, including Fashion-MNIST, SVHN, CIFAR-10, CIFAR-100, and Tiny ImageNet, demonstrate the effectiveness of our method. Additionally, our model also performs well in cross-architecture experiments. ... Table 1 presents a comparative evaluation of our method against prior approaches on Fashion-MNIST [58], SVHN [43], CIFAR-10 [23], and CIFAR-100 [23]. The results for Tiny ImageNet [26] are provided in Appendix I. ... We conducted an ablation study on different curvature values K within the DM framework on CIFAR-10.
Researcher Affiliation	Academia	Wenyuan Li Hokkaido University EMAIL Guang Li Hokkaido University EMAIL Keisuke Maeda Hokkaido University EMAIL Takahiro Ogawa Hokkaido University EMAIL Miki Haseyama Hokkaido University EMAIL
Pseudocode	No	The paper describes the methodology and framework in Section 3 and Figure 2, but does not present a dedicated pseudocode block or algorithm steps.
Open Source Code	Yes	The code is available at https://github.com/Guang000/HDD.
Open Datasets	Yes	We evaluated HDD on several standard benchmark datasets, including Fashion-MNIST [58], SVHN [43], CIFAR-10 [23], CIFAR-100 [23], and the larger-scale Tiny ImageNet [26].
Dataset Splits	Yes	Fashion-MNIST is a drop-in replacement for the classic MNIST dataset, comprising 70,000 grayscale images of size 28×28 pixels across 10 apparel categories (e.g., T-shirt/top, sneaker) with a 60,000/10,000 train/test split. ... SVHN contains approximately 600,000 real-world 32×32 RGB digit crops (0–9) collected from Google Street View images. It is partitioned into training (73,257), testing (26,032), and an extra set of 531,131 samples for data augmentation. ... CIFAR-100 (building on CIFAR-10) contains 60,000 32×32 color images in 100 fine classes (600 images each) grouped into 20 coarse superclasses. Each fine class has a 500/100 train/test split, enabling hierarchical and fine-grained classification studies. ... Tiny ImageNet ... provides 100,000 images (500 train, 50 val, 50 test per class), offering a mid-scale benchmark between CIFAR and full ImageNet. ... ImageWoof ... It contains 9,025 training and 3,929 validation images...
Hardware Specification	Yes	All experiments are conducted on one RTX A6000 Ada GPU, except for Section 4.5. ... All experiments in this section are conducted on an RTX 4090.
Software Dependencies	No	The paper mentions using SGD as an optimizer and specific network architectures (ConvNet, AlexNet, VGG11, ResNet18), but does not specify version numbers for any software libraries or dependencies like Python, PyTorch, or TensorFlow.
Experiment Setup	Yes	Our hyperparameter settings follow the design of the DM [68], IDM [70], and Dance [64] architectures. We adopt the differentiable siamese augmentation [67] enhancement method used in prior works. The synthetic dataset is learned using SGD. For DM with HDD, we train for 20,000 iterations, while for IDM with HDD and Dance with HDD, we train for 10,000 iterations. For all experiments, we set the batch size to 256. Additionally, for different experiments, we use distinct hyperbolic curvature K, gradient scaling factor λ, and synthetic image learning rate r, as detailed in Appendix G. ... Appendix G Hyperparameter Details: For different experiments, we use distinct hyperbolic curvature K, gradient scaling factor λ, and synthetic image learning rate r, as shown in Table 7 and Table 8. For the hyperbolic curvature K, we set it between 0.2 and 3. For the gradient scaling factor λ, we refer to the loss in Hilbert space and ensure that the hyperbolic distance loss maintains the same order of magnitude as the Hilbert space loss through λ. We make minor adjustments to the synthetic image learning rate r while respecting the original method.
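
Illustrative note on the Experiment Setup entry: the quoted setup names a hyperbolic curvature K (set between 0.2 and 3) and a gradient scaling factor λ applied to a hyperbolic distance loss, but this page does not say which hyperbolic model or loss form the paper uses. The following is only a minimal sketch under stated assumptions: it assumes the Lorentz (hyperboloid) model with curvature -K and treats λ as a plain multiplicative weight. All function names are hypothetical and are not taken from the HDD repository.

```python
import math

def lorentz_inner(x, y):
    """Lorentzian inner product: -x0*y0 + sum of the remaining coordinate products."""
    return -x[0] * y[0] + sum(a * b for a, b in zip(x[1:], y[1:]))

def exp_map_origin(v, K):
    """Map a Euclidean tangent vector v at the origin onto the hyperboloid
    {x : <x, x>_L = -1/K} (assumed model, curvature -K with K > 0)."""
    sqrt_k = math.sqrt(K)
    norm = math.sqrt(sum(a * a for a in v))
    if norm == 0.0:
        return [1.0 / sqrt_k] + [0.0] * len(v)
    scale = math.sinh(sqrt_k * norm) / (sqrt_k * norm)
    return [math.cosh(sqrt_k * norm) / sqrt_k] + [scale * a for a in v]

def hyperbolic_distance(x, y, K):
    """Geodesic distance on the hyperboloid: arccosh(-K <x, y>_L) / sqrt(K)."""
    # Clamp to 1.0 because rounding can push the argument slightly below 1.
    arg = max(1.0, -K * lorentz_inner(x, y))
    return math.acosh(arg) / math.sqrt(K)

def scaled_hyperbolic_loss(x, y, K, lam):
    """Assumption: lambda is a simple multiplicative weight on the distance."""
    return lam * hyperbolic_distance(x, y, K)

if __name__ == "__main__":
    K = 0.5  # a value inside the 0.2 to 3 range quoted above
    origin = exp_map_origin([0.0, 0.0], K)
    point = exp_map_origin([0.3, 0.4], K)
    print(hyperbolic_distance(origin, point, K))
    print(scaled_hyperbolic_loss(origin, point, K, lam=2.0))
```

Under this assumed model, the distance from the origin to an exponential-mapped point recovers the Euclidean norm of the tangent vector, which gives a quick sanity check when trying different values of K.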