Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Learning Treewidth-Bounded Bayesian Networks with Thousands of Variables

Authors: Mauro Scanagatta, Giorgio Corani, Cassio P. de Campos, Marco Zaffalon

NeurIPS 2016

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We compare k-A*, k-G, S2, S2+ and TWILP in various experiments. We compare them through an indicator which we call W-score: the percentage of worsening of the BIC score... For the first time we present experimental results for structural learning with bounded treewidth for domains involving up to ten thousand variables.
Researcher Affiliation	Academia	Mauro Scanagatta IDSIA, SUPSI, USI Lugano, Switzerland EMAIL Giorgio Corani IDSIA, SUPSI, USI Lugano, Switzerland EMAIL Cassio P. de Campos Queen's University Belfast Northern Ireland, UK EMAIL Marco Zaffalon IDSIA Lugano, Switzerland EMAIL
Pseudocode	No	The paper describes the algorithms k-A* and k-G in prose but does not provide them in a structured pseudocode or algorithm block.
Open Source Code	Yes	Software and supplementary material are available from http://blip.idsia.ch.
Open Datasets	Yes	We now present experiments on the data sets considered by Nie et al. (2016). They involve up to 100 variables. We consider 10 large data sets (100 ≤ n ≤ 400) listed in Table 3. Eventually we consider 14 very large data sets, containing between 400 and 10000 variables... three randomly-generated synthetic data sets... generated using the software BNGenerator 4. http://sites.poli.usp.br/pmr/ltd/Software/BNGenerator/
Dataset Splits	No	The paper does not provide specific train/validation/test dataset split information. It mentions 'a complete data set of N instances D = {D1, ..., DN}' and 'We split each data set randomly into three subsets' for experimental purposes, but these are not train/validation/test splits.
Hardware Specification	No	The paper does not provide specific hardware details (e.g., GPU/CPU models, memory amounts) used for running its experiments.
Software Dependencies	No	The paper mentions 'Gobnilp solver' and 'Max SAT solver' without version numbers. It mentions 'BNGenerator 4' for synthetic data generation, but this is not a core software dependency for the main methodology.
Experiment Setup	Yes	We allow 60 seconds of time for the computation of the scores of the parent set of each variable, in each data set. We allow each method to run for ten minutes. We let each method run for one hour. We set the bounded treewidth to k = 4. We consider the following treewidths: k ∈ {2, 5, 8}. All variables are binary and we sample their conditional probability tables from a Beta(1,1). We sample 10,000 instances from each generated inverted tree.
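
The synthetic-data part of the Experiment Setup row (binary variables, conditional probability tables drawn from a Beta(1,1), 10,000 sampled instances) can be illustrated with a minimal sketch. This is not the authors' code and does not use BNGenerator: the three-variable structure, the function names, and the random seed below are hypothetical and chosen only for illustration. The only details taken from the quoted text are the Beta(1,1) prior, the binary variables, and the sample size.

```python
import random
from itertools import product


def sample_cpts(parents, rng):
    """For each binary variable, draw P(X=1 | parent configuration) from Beta(1, 1)."""
    cpts = {}
    for var, pa in parents.items():
        cpts[var] = {
            config: rng.betavariate(1, 1)  # Beta(1, 1) is uniform on [0, 1]
            for config in product((0, 1), repeat=len(pa))
        }
    return cpts


def sample_instances(parents, cpts, order, n, rng):
    """Ancestral sampling: visit variables in topological order, parents first."""
    data = []
    for _ in range(n):
        row = {}
        for var in order:
            config = tuple(row[p] for p in parents[var])
            row[var] = 1 if rng.random() < cpts[var][config] else 0
        data.append(row)
    return data


# Hypothetical toy structure (an assumption, not from the paper): C has parents A and B.
parents = {"A": (), "B": (), "C": ("A", "B")}
order = ["A", "B", "C"]

rng = random.Random(0)  # fixed seed, chosen arbitrarily, so the run is repeatable
cpts = sample_cpts(parents, rng)
data = sample_instances(parents, cpts, order, 10_000, rng)
```

Each entry of `data` is a dictionary mapping a variable name to 0 or 1. Fixing the seed is a design choice of this sketch; the quoted text does not say whether the original experiments used fixed seeds.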