Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Learning latent causal graphs via mixture oracles
Authors: Bohdan Kivva, Goutham Rajendran, Pradeep Ravikumar, Bryon Aragam
NeurIPS 2021 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We implement these algorithms as part of an end-to-end pipeline for learning the full causal graph and illustrate its performance on simulated data. To test this pipeline, we ran experiments on synthetic data. Full details about these experiments, including a detailed description of the entire pipeline, can be found in Appendix G in the supplement. |
| Researcher Affiliation | Academia | Bohdan Kivva, University of Chicago; Goutham Rajendran, University of Chicago; Pradeep Ravikumar, Carnegie Mellon University; Bryon Aragam, University of Chicago |
| Pseudocode | No | The paper describes algorithms but does not include a formally labeled 'Pseudocode' or 'Algorithm' block. |
| Open Source Code | Yes | 3. If you ran experiments... (a) Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes] See supplementary materials |
| Open Datasets | No | To test this pipeline, we ran experiments on synthetic data. Full details about these experiments, including a detailed description of the entire pipeline, can be found in Appendix G in the supplement. Data generation: We start with a causal DAG G generated from the Erdős–Rényi model, for different settings of the model parameters. We then generate samples from the probability distribution that corresponds to G. (A minimal data-generation sketch follows the table.) |
| Dataset Splits | No | Figure 3 reports the results of 600 simulations; 300 each for sample sizes of 10000 and 15000. The paper focuses on generating data and then running experiments on it, but does not explicitly define how this generated data is split into train/validation sets in the main text. |
| Hardware Specification | No | The paper states 'See supplementary materials' for compute and resource details (Section 3.d), but no specific hardware models (GPU, CPU, etc.) are mentioned within the main body of the paper. |
| Software Dependencies | No | In our implementation, we used k-means. In our experiments, we use the Fast Greedy Equivalence Search [57] with the discrete BIC score, without assuming faithfulness. No specific version numbers for software or libraries are provided. (A minimal k-means sketch follows the table.) |
| Experiment Setup | No | Full details about these experiments, including a detailed description of the entire pipeline, can be found in Appendix G in the supplement. The main text describes the general approach but lacks specific hyperparameter values or detailed training configurations for the experimental setup. |
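
The Open Datasets row quotes only a high-level description of the synthetic pipeline; the full details are in the paper's Appendix G. The following is a minimal sketch of that kind of setup, assuming an Erdős–Rényi random DAG over discrete variables followed by forward sampling. The parameter names and values (`n_nodes`, `edge_prob`, `domain_size`, `n_samples`) are illustrative assumptions, not the paper's actual settings.

```python
# Minimal sketch of Erdős–Rényi DAG generation plus forward sampling of a
# discrete Bayesian network. This is an assumed setup for illustration only;
# the paper's actual data-generation settings are described in its Appendix G.
import numpy as np

def random_er_dag(n_nodes, edge_prob, rng):
    """Strictly upper-triangular adjacency matrix => acyclic by construction."""
    adj = rng.random((n_nodes, n_nodes)) < edge_prob
    return np.triu(adj, k=1)

def sample_discrete_bn(adj, domain_size, n_samples, rng):
    """Forward-sample a discrete Bayesian network with random Dirichlet CPTs."""
    n = adj.shape[0]
    data = np.zeros((n_samples, n), dtype=int)
    for j in range(n):  # nodes 0..n-1 are already in topological order
        parents = np.flatnonzero(adj[:, j])
        n_configs = domain_size ** len(parents)
        cpt = rng.dirichlet(np.ones(domain_size), size=n_configs)
        if len(parents) == 0:
            idx = np.zeros(n_samples, dtype=int)
        else:
            # encode each sample's parent configuration as a single CPT row index
            idx = np.ravel_multi_index(data[:, parents].T,
                                       (domain_size,) * len(parents))
        for s in range(n_samples):
            data[s, j] = rng.choice(domain_size, p=cpt[idx[s]])
    return data

rng = np.random.default_rng(0)
adj = random_er_dag(n_nodes=8, edge_prob=0.3, rng=rng)
samples = sample_discrete_bn(adj, domain_size=3, n_samples=10000, rng=rng)
```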
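The Software Dependencies row mentions k-means as the implemented mixture oracle and Fast Greedy Equivalence Search with a discrete BIC score for structure recovery. The sketch below covers only the k-means step, using scikit-learn; choosing the number of components by silhouette score is an assumption made here for illustration, and the FGES step is omitted because it relies on an external implementation cited by the paper.

```python
# Minimal sketch of k-means used as a stand-in mixture oracle, as the quoted
# implementation note suggests. Selecting k via silhouette score is an
# illustrative assumption, not the paper's procedure; samples are assumed to
# be provided as a numeric 2D array.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

def kmeans_mixture_oracle(samples, max_components=10, random_state=0):
    """Cluster samples and return an estimated component count and labels."""
    best_k, best_score = 1, -np.inf
    best_labels = np.zeros(len(samples), dtype=int)
    for k in range(2, max_components + 1):
        labels = KMeans(n_clusters=k, n_init=10,
                        random_state=random_state).fit_predict(samples)
        score = silhouette_score(samples, labels)
        if score > best_score:
            best_k, best_score, best_labels = k, score, labels
    return best_k, best_labels
```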