Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Applications of Common Entropy for Causal Inference

Authors: Murat Kocaoglu, Sanjay Shakkottai, Alexandros G. Dimakis, Constantine Caramanis, Sriram Vishwanath

NeurIPS 2020 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We evaluate our algorithms on synthetic and real data to establish their performance.
Researcher Affiliation	Collaboration	Murat Kocaoglu (MIT-IBM Watson AI Lab, IBM Research); Sanjay Shakkottai (The University of Texas at Austin); Alexandros G. Dimakis (The University of Texas at Austin); Constantine Caramanis (The University of Texas at Austin); Sriram Vishwanath (The University of Texas at Austin)
Pseudocode	Yes	Algorithm 1 Latent Search: Iterative Update Algorithm, Algorithm 2 Infer Graph: Identifying the Latent Graph, Algorithm 3 Entropic PC (for F = False) and Entropic PC-C (for F = True)
Open Source Code	No	The paper does not include an unambiguous statement or a direct link indicating that the source code for the described methodology is publicly available.
Open Datasets	Yes	On ADULT Dataset: We compare the performance of Entropic PC with the baseline PC algorithm that we used for our modifications. ... [13] Dua Dheeru and Efi Karra Taniskidou. UCI machine learning repository, 2017. ... Second, in Figure 3a, we run Latent Search on the real cause-effect pairs from Tuebingen dataset [36]. ... [36] Joris M Mooij, Jonas Peters, Dominik Janzing, Jakob Zscheischler, and Bernhard Schölkopf. Distinguishing cause from effect using observational data: methods and benchmarks. The Journal of Machine Learning Research, 17(1):1103–1204, 2016.
Dataset Splits	No	The paper mentions generating "datasets with 1k, 5k, 80k samples" for synthetic data, and uses real datasets like ADULT and Tuebingen, but does not explicitly provide specific training/validation/test splits (e.g., percentages or exact counts) or refer to predefined standard splits for all datasets used.
Hardware Specification	No	The paper mentions using "computing resources from TACC" in the acknowledgements, but it does not provide specific details on the hardware used, such as GPU/CPU models, memory, or other detailed computer specifications for running the experiments.
Software Dependencies	No	The paper states: "We use pcalg package in R [24, 18]". However, it does not specify the version numbers for the `pcalg` package or R, which is required for reproducibility.
Experiment Setup	Yes	When using Latent Search to approximate Rényi1 common entropy, we will run it for multiple β values and pick the distribution q(.) with the smallest H(Z) such that I(X; Y | Z) ≤ θ for a practical threshold θ to declare conditional independence. ... To address this, in simulations H( ) is set to 0.1 min{H(X), H(Y )} in line 10. This and the choice of 0.8 as the coefficient in line 16 can be seen as hyper-parameters to be tuned. ... we conclude that 0.001 is a suitable CMI threshold for this dataset.
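
The selection rule quoted in the Experiment Setup row (among candidate distributions q, keep those with I(X; Y | Z) ≤ θ and pick the one with the smallest H(Z)) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' Latent Search implementation: the function names are hypothetical, the candidates are hand-built toy joints rather than outputs of runs at different β values, entropies are measured in bits (the paper's unit is not stated in this record), and θ = 0.001 is borrowed from the threshold quoted above for one dataset.

```python
import numpy as np


def entropy(p):
    """Shannon entropy in bits of a probability array (zeros are skipped)."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())


def conditional_mutual_information(joint):
    """I(X; Y | Z) in bits for a joint array indexed as joint[x, y, z]."""
    joint = np.asarray(joint, dtype=float)
    h_xz = entropy(joint.sum(axis=1))      # marginal over Y leaves (X, Z)
    h_yz = entropy(joint.sum(axis=0))      # marginal over X leaves (Y, Z)
    h_z = entropy(joint.sum(axis=(0, 1)))  # marginal over X and Y leaves Z
    # I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z)
    return h_xz + h_yz - entropy(joint) - h_z


def select_candidate(candidates, theta):
    """Return the feasible candidate with the smallest H(Z), or None.

    A candidate is feasible when its I(X; Y | Z) is at most theta.
    """
    feasible = [q for q in candidates
                if conditional_mutual_information(q) <= theta]
    if not feasible:
        return None
    return min(feasible, key=lambda q: entropy(q.sum(axis=(0, 1))))


# Toy candidates (assumed for illustration): X and Y are identical fair bits.
copy_latent = np.zeros((2, 2, 2))      # Z copies X: separates X and Y
copy_latent[0, 0, 0] = 0.5
copy_latent[1, 1, 1] = 0.5

constant_latent = np.zeros((2, 2, 1))  # Z is constant: H(Z) = 0, no separation
constant_latent[0, 0, 0] = 0.5
constant_latent[1, 1, 0] = 0.5

wide_latent = np.zeros((2, 2, 4))      # Z has 4 states: separates, larger H(Z)
for z in range(4):
    wide_latent[z % 2, z % 2, z] = 0.25

best = select_candidate([constant_latent, wide_latent, copy_latent], theta=0.001)
print("H(Z) of selected candidate:", entropy(best.sum(axis=(0, 1))))
```

In this toy setup the constant latent has the lowest entropy but fails the conditional independence check, so the rule falls back to the lowest-entropy latent that does separate X and Y.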