Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Minimal I-MAP MCMC for Scalable Structure Discovery in Causal DAG Models
Authors: Raj Agrawal, Caroline Uhler, Tamara Broderick
ICML 2018 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In Section 6 we empirically compare our model to order MCMC and partition MCMC (Kuipers & Moffa, 2017), the state-of-the-art version of structure MCMC. In experiments we observe O(p^3) time scaling for our method, and we demonstrate better mixing and ROC performance for our method on several datasets. |
| Researcher Affiliation | Academia | ¹Computer Science and Artificial Intelligence Laboratory, ²Institute for Data, Systems and Society, ³Laboratory for Information and Decision Systems, Massachusetts Institute of Technology. Correspondence to: Raj Agrawal <EMAIL, EMAIL>. |
| Pseudocode | Yes | Algorithm 1 Minimal I-MAP MCMC; Algorithm 2, denoted as update minimal I-MAP (UMI), is used as a step in Algorithm 1 and describes how to compute one minimal I-MAP Ĝ from another when [their underlying permutations] differ by an adjacent transposition without recomputing all edges; see also Solus et al. (2017). |
| Open Source Code | No | The paper states: "In terms of software, we used the code provided by Kuipers & Moffa (2017) to run partition and order MCMC. We used the method and software of Kangas et al. (2016) for counting linear extensions for bias correction, and we implemented minimal I-MAP MCMC using the R-package bnlearn." This indicates they used third-party code but does not state that their own source code is provided or made publicly available. |
| Open Datasets | Yes | The third dataset is from the Dream4 in-silico network challenge (Schaffter et al., 2011) on gene regulation. |
| Dataset Splits | No | The paper does not explicitly mention the use of a validation set or specific training/validation/test splits. It discusses burn-in and thinning for the MCMC chains but not dataset partitioning for model training vs. evaluation. |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU/CPU models, memory, or cloud computing instance types used for running the experiments. |
| Software Dependencies | No | The paper mentions implementing minimal I-MAP MCMC using the "R-package bnlearn" but does not specify version numbers for R, bnlearn, or any other software dependencies. |
| Experiment Setup | Yes | For each dataset, we ran the Markov chains for 10^5 iterations, including a burn-in of 2 × 10^4 iterations, and thinned the remaining iterations by a factor of 100. [...] To achieve this end, we choose a prior of the form P(G) = P(G) exp(-γ|A|) where P(G) can include any structural information known about the DAG. [...] Input: Data D, number of iterations T, significance level α, initial permutation π0, sparsity strength γ, thinning rate τ. |
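
The burn-in and thinning figures quoted in the Experiment Setup row imply a specific number of retained samples per chain. The following is a minimal sketch of that arithmetic, not the authors' implementation. The function name `retained_iterations` is hypothetical, and the convention that the first kept sample is the one immediately after burn-in is an assumption; the paper excerpt does not say which offset is used.

```python
def retained_iterations(total, burn_in, thin):
    """Return the 0-based iteration indices kept after burn-in and thinning.

    Assumption: the first kept index is `burn_in` itself, and every
    `thin`-th iteration after it is kept. Other offsets are possible.
    """
    return list(range(burn_in, total, thin))


# Values quoted in the table: 10^5 iterations, burn-in 2 x 10^4, thinning 100.
kept = retained_iterations(total=10**5, burn_in=2 * 10**4, thin=100)
print(len(kept))  # 800 retained samples per chain
```

Under these assumptions each chain contributes (10^5 - 2 × 10^4) / 100 = 800 samples to the reported estimates.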