reproducibilityindex.ai

A Bayesian Approach for Estimating Causal Effects from Observational Data

Authors: Johan Pensar, Topi Talvitie, Antti Hyttinen, Mikko Koivisto

Pages: 5395-5402

AAAI 2020

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We evaluated the performance of our method by three empirical studies. The first study uses simulated data to examine the behaviour of our posterior and, in particular, the accuracy of the resulting estimates as compared to those of different IDA variants. In the second study, we assess the accuracy of our method as well as ancestor relation probabilities in terms of causal effect discovery. The third study demonstrates the applicability of our approach to real-world data.
Researcher Affiliation	Academia	Johan Pensar, Dept. of Math. and Stat., University of Helsinki, johan.pensar@helsinki.fi; Topi Talvitie, Dept. of Computer Science, University of Helsinki, topi.talvitie@helsinki.fi; Antti Hyttinen, HIIT & Dept. of CS, University of Helsinki, antti.hyttinen@helsinki.fi; Mikko Koivisto, Dept. of Computer Science, University of Helsinki, mikko.koivisto@helsinki.fi
Pseudocode	Yes	Algorithm 1: Computing the unnormalized parent set and ancestor relation probabilities. 1: Compute the zeta transform ŵ_v of the local weight function w_v for each v ∈ V. 2: Compute the forward function f and the auxiliary function g using Eqs. (12) and (14). 3: Compute the backward function b_i for each i ∈ V using Eq. (13). 4: Compute the weight W_i(S) for each i ∈ V and S ⊆ V \ {i} using Eq. (10). 5: Compute the weight W_{i,j} for each pair i, j ∈ V using Eq. (11).
Open Source Code	Yes	The code package and the supplementary material are available at https://github.com/jopensar/BIDA.
Open Datasets	Yes	Finally, we apply our method on observational flow cytometry data... As an example of a possible application for our method, we consider the flow cytometry data (Sachs et al. 2005).
Dataset Splits	No	The paper mentions generating data sets with increasing sample sizes (n=50, 200, 800) but does not specify explicit train/validation/test splits for these datasets within the main text.
Hardware Specification	Yes	For a data set on 20 variables, the computations take about 25 minutes on a modern laptop computer (single thread, Intel Core i7-6600U, 2.60 GHz).
Software Dependencies	No	The paper states that Algorithm 1 was implemented in C++ and the rest of the method in R, but it does not provide specific version numbers for these languages or any dependent software libraries/packages.
Experiment Setup	Yes	For calculating the parent set and ancestor relation probabilities, we limited the parent set size to 6 and used the fractional marginal likelihood (Consonni and Rocca 2012), with α_Ω = d - 1 and n_0 = 1, together with a uniform graph prior. The hyperparameters in the prior for the linear model (5) were set as follows: m_0 = 0, Λ_0 = I, and a_0 = b_0 = 1.
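
Step 1 of Algorithm 1 in the Pseudocode row refers to a zeta transform of the local weight functions. The sketch below illustrates the standard subset-sum zeta transform, ŵ(S) = Σ_{T ⊆ S} w(T), on a toy weight table. It is not taken from the BIDA package: the function names `zeta_transform` and `zeta_naive`, the bitmask encoding of subsets, and the example weights are all assumptions made for illustration, and the paper's own definition (for example, with a bounded parent set size) may differ in detail.

```python
def zeta_transform(w, n):
    """Fast subset-sum zeta transform (hypothetical helper, not from BIDA).

    Subsets of {0, ..., n-1} are encoded as bitmasks, and w is a list of
    length 2**n. Returns w_hat with w_hat[S] = sum of w[T] over all T ⊆ S,
    using O(n * 2**n) additions.
    """
    w_hat = list(w)  # copy so the caller's table is left untouched
    for i in range(n):
        bit = 1 << i
        for mask in range(1 << n):
            if mask & bit:
                # Add the contribution of the same subset without element i.
                w_hat[mask] += w_hat[mask ^ bit]
    return w_hat


def zeta_naive(w, n):
    """Direct evaluation of the definition, used only as a cross-check."""
    out = []
    for s in range(1 << n):
        total = 0
        t = s
        while True:  # enumerate every submask t of s, including 0
            total += w[t]
            if t == 0:
                break
            t = (t - 1) & s
        out.append(total)
    return out


# Toy weights for n = 3 (illustrative values only).
toy_weights = [1, 2, 3, 4, 5, 6, 7, 8]
fast = zeta_transform(toy_weights, 3)
slow = zeta_naive(toy_weights, 3)
print(fast)          # [1, 3, 4, 10, 6, 14, 16, 36]
print(fast == slow)  # True
```

The fast version processes one element at a time, which is what makes the transform feasible over all subsets of a moderately sized variable set. An implementation for real data would more likely work with log-weights and a size-limited family of parent sets.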