Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.

Distance-informed Neural Processes

Authors: Aishwarya Venkataramanan, Joachim Denzler

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Empirical results demonstrate that DNP achieves strong predictive performance and improved uncertainty calibration across regression and classification tasks. Section 5 (Experiments): We evaluate DNP on both regression and classification tasks. For regression, we consider 1D synthetic examples, as well as multi-dimensional tasks on several real-world datasets. Performance is assessed in terms of predictive quality and uncertainty calibration [31]. For classification, we assess the model on CIFAR-10 and CIFAR-100 [30], evaluating classification accuracy, calibration performance, and OOD detection capabilities.
Researcher Affiliation	Academia	Aishwarya Venkataramanan, Computer Vision Group, Friedrich Schiller University Jena, Germany; Joachim Denzler, Computer Vision Group, Friedrich Schiller University Jena, Germany
Pseudocode	Yes	A.1 Algorithm: The training and inference procedures for DNP are outlined in Algorithms 1 and 2, respectively.
Open Source Code	Yes	Code is available: https://github.com/cvjena/DNP.git. The code will be made open-source. The code for synthetic data generation is included. The real-world datasets for the regression and classification experiments are already available publicly.
Open Datasets	Yes	We evaluate DNP on three benchmark datasets: SARCOS [61], which models the inverse dynamics of a seven-degree-of-freedom robot arm by predicting seven joint torques from 21 inputs (positions, velocities, accelerations); Water Quality (WQ) [7], which predicts species abundances in Slovenian rivers from 16 physical and chemical indicators; and SCM20D [50], a supply-chain time series for multi-item demand forecasting. For classification using CIFAR10, we use the SVHN [46], CIFAR100 and the Tiny ImageNet [33] as the OOD datasets. For CIFAR100, the OOD datasets are SVHN, CIFAR10 and Tiny ImageNet. The real-world datasets for the regression and classification experiments are already available publicly.
Dataset Splits	Yes	All datasets are standardized to zero mean and unit variance per feature and split 80/20 into training and test sets. For each training batch, the number of context points is sampled uniformly from the interval [3, 50], while the number of target points is fixed at 50. For training, the number of context points is drawn from a uniform distribution between 16 and 128.
Hardware Specification	Yes	The models were implemented using PyTorch [43] and trained on an Nvidia GeForce GTX 1080 with 12 GB of RAM. The models were implemented using PyTorch [43] and trained on an NVIDIA A100 GPU with 40 GB of memory.
Software Dependencies	No	The models were implemented using PyTorch [43] and trained on an Nvidia GeForce GTX 1080 with 12 GB of RAM. To obtain the singular values for DNP, we use PyTorch's implementation of the LOBPCG method, which employs orthogonal basis selection [51].
Experiment Setup	Yes	All the models are trained using the Adam optimizer [26] at a learning rate of 1e-3, for 200 epochs. For DNP, the bi-Lipschitz constraint is enforced with lower and upper singular value bounds λ1 = 0.1, λ2 = 1.0, and the weight balancing the ELBO and bi-Lipschitz regularization is set to β = 1. All networks were trained with the Adam optimizer [26] at a learning rate of 1e-3, using a batch size of 50 for 200 epochs. Models are trained using the Adam optimizer [26] with a learning rate of 1e-3, and a batch size of 100, for 100 epochs. Models are trained using the Adam optimizer [26] with a learning rate of 1e-4 for 200 epochs.
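
The Dataset Splits row describes per-feature standardization, an 80/20 train/test split, and a context size sampled uniformly from [3, 50] with 50 target points. The following is a minimal sketch of that procedure, not the authors' code. Several details are assumptions because the quoted text does not state them: standardization statistics are computed on the full dataset before splitting (following the sentence order), the split is a random shuffle, context and target sets are disjoint, and the seed, the toy data, and all function names are hypothetical.

```python
import random
import statistics


def standardize(rows):
    """Scale each feature column to zero mean and unit variance."""
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    # Population standard deviation; fall back to 1.0 for constant columns.
    stds = [statistics.pstdev(col) or 1.0 for col in columns]
    return [
        [(value - m) / s for value, m, s in zip(row, means, stds)]
        for row in rows
    ]


def split_80_20(rows, rng):
    """Shuffle and split rows 80/20 (random shuffling is an assumption)."""
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = (len(shuffled) * 4) // 5
    return shuffled[:cut], shuffled[cut:]


def sample_context_target(train_rows, rng, min_context=3, max_context=50,
                          num_target=50):
    """Draw one batch: a uniformly sized context set and a fixed-size target set.

    Keeping the two sets disjoint is an assumption of this sketch.
    """
    num_context = rng.randint(min_context, max_context)  # inclusive bounds
    chosen = rng.sample(train_rows, num_context + num_target)
    return chosen[:num_context], chosen[num_context:]


rng = random.Random(0)  # hypothetical seed
# Toy two-feature dataset standing in for a real regression benchmark.
data = [[rng.gauss(5.0, 2.0), rng.gauss(-1.0, 0.5)] for _ in range(500)]
train, test = split_80_20(standardize(data), rng)
context, target = sample_context_target(train, rng)
print(len(train), len(test), len(context), len(target))
```

Computing the statistics on the training split only would avoid leaking test information into the scaling. Which variant the paper uses cannot be determined from the quoted text.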