Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Dataset Distillation of 3D Point Clouds via Distribution Matching
Authors: Jae-Young Yim, Dongwook Kim, Jae-Young Sim
NeurIPS 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on widely used benchmark datasets demonstrate that the proposed method consistently outperforms existing dataset distillation approaches, achieving higher accuracy and strong cross-architecture generalization. ... 4 Experimental Results |
| Researcher Affiliation | Academia | Jae-Young Yim, Dongwook Kim, and Jae-Young Sim, Ulsan National Institute of Science and Technology, EMAIL |
| Pseudocode | No | The paper describes the methodology and mathematical formulations but does not include any clearly labeled pseudocode or algorithm blocks. Figure 1 provides a high-level framework diagram, but it is not pseudocode. |
| Open Source Code | No | Justification: The paper includes all necessary details to reproduce the main experimental results, and we are planning to release the official code upon publication. ... Justification: We plan to release the implementation code upon publication. The released code will include the full implementation of proposed framework. |
| Open Datasets | Yes | The proposed method was evaluated on the ModelNet10 [25], ModelNet40 [25], ShapeNet [4], and ScanObjectNN [20] datasets. |
| Dataset Splits | Yes | Table 10: Train/test statistics for ModelNet10, ModelNet40, ShapeNet and ScanObjectNN. ... # of training samples 3991 # of test samples 908 (ModelNet10) ... We use the PB_T50_RS split, which includes perturbed background, translation jitter, rotation, and scaling. (ScanObjectNN) |
| Hardware Specification | Yes | All experiments were conducted on a single NVIDIA GeForce RTX 3090 GPU. |
| Software Dependencies | No | The paper mentions using PointNet as a backbone for some experiments but does not specify software versions for programming languages, libraries, or other tools. |
| Experiment Setup | Yes | Table 11: Hyperparameter settings used for (a) optimizing the synthetic dataset, (b) optimizing the rotation parameters, and (c) evaluation network. ... Implementation Details. We optimized the synthetic dataset S using stochastic gradient descent (SGD) with a learning rate of 0.01, a momentum of 0.5, a weight decay of 0, and a batch size of 8 per class sampled from the original dataset T, while the batch size of the synthetic dataset was set equal to the number of synthetic samples per class (PPC). |
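
The optimizer settings quoted in the Experiment Setup row can be captured in a small configuration sketch. This is a minimal illustration and not the authors' code, which had not been released at the time of this assessment. The names `DistillConfig`, `synthetic_batch_per_class`, and `sgd_momentum_step` are hypothetical, and the momentum update rule (v = mu * v + g, then p = p - lr * v) is an assumed common convention; the paper excerpt does not state which deep learning framework or update variant was used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DistillConfig:
    """Hyperparameters quoted in the Experiment Setup row (field names are hypothetical)."""
    lr: float = 0.01               # SGD learning rate
    momentum: float = 0.5          # SGD momentum
    weight_decay: float = 0.0      # weight decay of 0
    real_batch_per_class: int = 8  # samples per class drawn from the original dataset T


def synthetic_batch_per_class(ppc: int) -> int:
    """The synthetic batch size equals the number of synthetic samples per class (PPC)."""
    return ppc


def sgd_momentum_step(params, grads, velocity, cfg):
    """One SGD-with-momentum step on plain floats.

    Assumed convention: v = mu * v + g, then p = p - lr * v.
    """
    new_params, new_velocity = [], []
    for p, g, v in zip(params, grads, velocity):
        g = g + cfg.weight_decay * p  # no effect when weight_decay is 0
        v = cfg.momentum * v + g
        new_params.append(p - cfg.lr * v)
        new_velocity.append(v)
    return new_params, new_velocity


cfg = DistillConfig()
params, velocity = sgd_momentum_step([1.0], [2.0], [0.0], cfg)
```

Because the Software Dependencies row records no library versions, any reimplementation would need to confirm that its optimizer follows the same momentum convention before comparing results.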