Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. [K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026].

Collaborative and Confidential Junction Trees for Hybrid Bayesian Networks

Authors: Roberto Gheda, Abele Mălan, Thiago Guzella, Carlo Lancia, Robert Birke, Lydia Chen

NeurIPS 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We evaluate our method against nine different datasets and report improvements compared to non-hybrid confidentiality-preserving methods. We obtain a 32% average decrease in mean squared error and up to 86 reduction in communication costs. Furthermore, our method uses up to 331 smaller communication costs under purely discrete scenarios.
Researcher Affiliation	Collaboration	Roberto Gheda (Delft University of Technology); Abele Malan (University of Neuchâtel); Thiago Guzella (ASML The Netherlands B.V.); Carlo Lancia (ASML The Netherlands B.V.); Robert Birke (University of Turin); Lydia Y. Chen (Delft University of Technology)
Pseudocode	Yes	Algorithm 1: Collaborative Inference Protocol; Algorithm 2: Collaborative Continuous Inference; Algorithm 3: Collaborative Discrete Inference
Open Source Code	Yes	Code is available at github.com/r-gheda/hybrid-ccjt.
Open Datasets	Yes	We evaluate Hybrid CCJT on nine publicly available models (see Appendix G for details) whose data structures are hybrid or discrete only. Table 5: Overview of used datasets (columns: Type, Dataset, #Discrete nodes, #Continuous nodes, #Arcs, #Params, Source; "-" marks an empty cell). Hybrid CLG: Healthcare, 3, 4, 9, 42, [23]; Sangiovese, 1, 14, 55, 259, [31]; Mehra, 8, 16, 71, 324423, [32]. Discrete: Child, 20, -, 25, 230, [33]; Alarm, 37, -, 46, 509, [34]; Insurance, 27, -, 52, 1008, [35]; Andes, 223, -, 338, 1157, [36]; Link, 724, -, 1125, 14211, [37]; Munin #2, 1003, -, 1244, 69431, [38]. Continuous: Ecoli70, -, 46, 70, 162, [39]; Magic-Niab, -, 44, 66, 154, [40].
Dataset Splits	No	The paper states: "We sample a dataset from each model. Then, we assign a subset of variables to each party. Each party receives the vertical split of the sampled dataset corresponding to its assigned variables." This describes how variables are distributed among parties for collaborative inference, but does not specify explicit training, validation, or test splits for model evaluation.
Hardware Specification	Yes	Timeout per experiment is set to 24 hours, on 512GB RAM. We mention the amount of system memory used for the experiments, which is the main limiting factor. We did not use any accelerators in our experiments.
Software Dependencies	No	The paper mentions cryptographic schemes like CKKS [20] and refers to Python for implementations in Appendix I, but it does not provide specific version numbers for any libraries, frameworks, or languages used.
Experiment Setup	No	The paper describes how network structures and parameters are learned (via 2-phase Restricted Maximization [28] and Maximum Likelihood Estimator/least squares regression models [29]), and mentions the number of queries and parties for evaluation. However, it does not specify concrete hyperparameters like learning rates, batch sizes, or optimizer settings for the models themselves.
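
The vertical split quoted in the Dataset Splits row (each party receives only the columns of the sampled dataset that correspond to its assigned variables) can be illustrated with a minimal sketch. This is not the authors' code: the function name vertical_split, the variable names A to D, the party names, and the random toy data are all hypothetical and chosen only for illustration.

```python
import random


def vertical_split(rows, assignment):
    """Return, for each party, the rows restricted to that party's variables.

    rows: list of dicts mapping variable name -> sampled value.
    assignment: dict mapping party name -> list of variable names.
    """
    return {
        party: [{var: row[var] for var in variables} for row in rows]
        for party, variables in assignment.items()
    }


# Hypothetical toy data: 5 samples over four made-up variables.
rng = random.Random(0)
variables = ["A", "B", "C", "D"]
rows = [{v: rng.random() for v in variables} for _ in range(5)]

# Hypothetical assignment of variables to two parties.
assignment = {"party_1": ["A", "B"], "party_2": ["C", "D"]}

shares = vertical_split(rows, assignment)
for party, share in shares.items():
    # Each party holds every sample, but only its own columns.
    print(party, sorted(share[0]), len(share))
```

Every party keeps all sampled rows but sees only its own variables, which is why this partitioning does not by itself define training, validation, or test splits.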