Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Dataset Distillation of 3D Point Clouds via Distribution Matching

Authors: Jae-Young Yim, Dongwook Kim, Jae-Young Sim

NeurIPS 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on widely used benchmark datasets demonstrate that the proposed method consistently outperforms existing dataset distillation approaches, achieving higher accuracy and strong cross-architecture generalization. ... 4 Experimental Results
Researcher Affiliation | Academia | Jae-Young Yim, Dongwook Kim, and Jae-Young Sim Ulsan National Institute of Science and Technology EMAIL
Pseudocode | No | The paper describes the methodology and mathematical formulations but does not include any clearly labeled pseudocode or algorithm blocks. Figure 1 provides a high-level framework diagram, but it is not pseudocode.
Open Source Code | No | Justification: The paper includes all necessary details to reproduce the main experimental results, and we are planning to release the official code upon publication. ... Justification: We plan to release the implementation code upon publication. The released code will include the full implementation of proposed framework.
Open Datasets | Yes | The proposed method was evaluated on the ModelNet10 [25], ModelNet40 [25], ShapeNet [4], and ScanObjectNN [20] datasets.
Dataset Splits | Yes | Table 10: Train/test statistics for ModelNet10, ModelNet40, ShapeNet and ScanObjectNN. ... # of training samples 3991 # of test samples 908 (ModelNet10) ... We use the PB_T50_RS split, which includes perturbed background, translation jitter, rotation, and scaling. (ScanObjectNN)
Hardware Specification | Yes | All experiments were conducted on a single NVIDIA GeForce RTX 3090 GPU.
Software Dependencies | No | The paper mentions using PointNet as a backbone for some experiments but does not specify software versions for programming languages, libraries, or other tools.
Experiment Setup | Yes | Table 11: Hyperparameter settings used for (a) optimizing the synthetic dataset, (b) optimizing the rotation parameters, and (c) evaluation network. ... Implementation Details. We optimized the synthetic dataset S using stochastic gradient descent (SGD) with a learning rate of 0.01, a momentum of 0.5, a weight decay of 0, and a batch size of 8 per class sampled from the original dataset T, while the batch size of the synthetic dataset was set equal to the number of synthetic samples per class (PPC).
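
The optimizer settings quoted in the Experiment Setup row can be written down as a small configuration, which may help when attempting a reproduction. The sketch below is illustrative only: the names `DistillConfig` and `sgd_momentum_step` are hypothetical, the paper's code is not yet released, and the update rule assumes the common SGD-with-momentum convention (no dampening, no Nesterov), which the quoted text does not confirm. It does not implement the paper's distribution-matching loss.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DistillConfig:
    """Hyperparameters quoted in the Experiment Setup row (names are hypothetical)."""
    lr: float = 0.01                # learning rate for the synthetic dataset S
    momentum: float = 0.5           # SGD momentum
    weight_decay: float = 0.0       # weight decay
    real_batch_per_class: int = 8   # samples per class drawn from the original dataset T

    def synthetic_batch_per_class(self, ppc: int) -> int:
        # The synthetic batch size equals the number of synthetic samples per class (PPC).
        return ppc


def sgd_momentum_step(params, grads, velocity, cfg):
    """One SGD-with-momentum update over lists of scalar parameters.

    Assumed convention: v <- momentum * v + (g + weight_decay * p); p <- p - lr * v.
    """
    new_params, new_velocity = [], []
    for p, g, v in zip(params, grads, velocity):
        g = g + cfg.weight_decay * p
        v = cfg.momentum * v + g
        new_params.append(p - cfg.lr * v)
        new_velocity.append(v)
    return new_params, new_velocity


if __name__ == "__main__":
    cfg = DistillConfig()
    params, velocity = [1.0], [0.0]
    # Two steps with a constant toy gradient of 2.0.
    for _ in range(2):
        params, velocity = sgd_momentum_step(params, [2.0], velocity, cfg)
    print(params, velocity)
```

In an actual reproduction the scalar lists would be replaced by the synthetic point-cloud tensors and the gradients of the matching loss; the configuration values are the only part taken directly from the quoted text.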