Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.

Computing Divergences between Discrete Decomposable Models

Authors: Loong Kuan Lee, Nico Piatkowski, François Petitjean, Geoffrey I. Webb

AAAI 2023

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Runtime Comparison with mcgo: Recall that a method already exists for computing the KL divergence between 2 BNs (Moral, Cano, and Gómez-Olmedo 2021) which we will refer to as mcgo. Also note that it is possible to take a distribution represented by a BN and, in exchange for some loss in independence information, represent it using a DM instead (Koller and Friedman 2009, p.p. 134). Therefore, one might ask, how does the practical runtime of JFComp compare to mcgo when computing the KL divergence between 2 BNs. To answer this question, we will replicate the experiment used by Moral, Cano, and Gómez-Olmedo.
Researcher Affiliation	Academia	Loong Kuan Lee¹, Nico Piatkowski², François Petitjean¹, and Geoffrey I. Webb¹; ¹Department of Data Science and AI, Monash University, Melbourne, Australia; ²Fraunhofer IAIS, Schloss Birlinghoven, 53757 Sankt Augustin, Germany
Pseudocode	No	The paper describes algorithmic procedures, but does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code	Yes	The repository for the implementation for JFComp can be found at: https://lklee.dev/pub/2023-aaai/code
Open Datasets	Yes	They chose a set of BNs from the bnlearn (Scutari 2010) repository (https://www.bnlearn.com/bnrepository/) to sample from and estimated a second BN from these samples.
Dataset Splits	No	The paper mentions sampling data and learning Bayesian Networks but does not provide specific train/validation/test dataset splits, percentages, or absolute sample counts for reproducibility.
Hardware Specification	Yes	We run the experiments on an Intel NUC-10i7FNH with 64GB of RAM.
Software Dependencies	No	The paper states 'The implementation of both methods are in Python and use the pgmpy library', but does not specify version numbers for Python or the pgmpy library.
Experiment Setup	Yes	We repeat this 10 times in order to get an estimate of both methods runtime in seconds.
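
The experiment setup quoted above (repeating a run 10 times to estimate runtime in seconds) can be sketched roughly as follows. This is a minimal illustration only, not the paper's JFComp or mcgo implementation: the helper names `kl_divergence` and `time_repeated`, the toy distributions `p` and `q`, and the use of the mean as the summary statistic are all assumptions introduced here. The workload is a brute-force KL divergence over two small explicit distributions, and it assumes `q` is positive wherever `p` is.

```python
import math
import statistics
import time


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions given as equal-length lists.

    Hypothetical helper: assumes q[i] > 0 wherever p[i] > 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def time_repeated(fn, repeats=10):
    """Call fn() `repeats` times; return the wall-clock runtimes in seconds."""
    runtimes = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        runtimes.append(time.perf_counter() - start)
    return runtimes


# Toy distributions over three outcomes (illustrative values, not from the paper).
p = [0.5, 0.25, 0.25]
q = [0.25, 0.5, 0.25]

kl_value = kl_divergence(p, q)
runtimes = time_repeated(lambda: kl_divergence(p, q), repeats=10)
mean_runtime = statistics.mean(runtimes)

print(f"KL(p || q) = {kl_value:.6f}")
print(f"mean runtime over {len(runtimes)} runs: {mean_runtime:.3e} s")
```

To time a real method, the lambda would be replaced by a call into that method; `time.perf_counter` is used because it is a monotonic, high-resolution clock suited to measuring short intervals.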