Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Learning Linear Non-Gaussian Causal Models in the Presence of Latent Variables

Authors: Saber Salehkaleybar, AmirEmad Ghassami, Negar Kiyavash, Kun Zhang

JMLR 2020 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental Experiments on synthetic data and real-world data show the effectiveness of our proposed algorithm for learning causal models.
Researcher Affiliation Academia Saber Salehkaleybar EMAIL Department of Electrical Engineering, Sharif University of Technology, Tehran, Iran; AmirEmad Ghassami EMAIL Department of ECE, University of Illinois at Urbana-Champaign, Urbana, IL 61801; Negar Kiyavash EMAIL College of Management of Technology, École Polytechnique Fédérale de Lausanne (EPFL); Kun Zhang EMAIL Department of Philosophy, Carnegie Mellon University, Pittsburgh, PA 15213
Pseudocode Yes Algorithm 1
Input: Collection of the sets des_o(V_i), 1 ≤ i ≤ p_o.
Run an over-complete ICA algorithm over observed variables V_o and obtain matrix B'.
for i = 1 : p_r do
    I_i = {k | [B']_{k,i} ≠ 0}
    for j = 1 : p_o do
        if I_i = des_o(V_j) then
            [B̂_o]_{:,j} = B'_{:,i} / [B'_{:,i}]_j
        end if
    end for
end for
Output: B̂_o
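The column-matching step of the quoted pseudocode can be sketched roughly as follows. This is an illustrative reconstruction, not the authors' code: the over-complete ICA step is assumed to have already produced `B_prime`, the descendant sets `des_o` are given as 0-indexed Python sets, and the function name and tolerance are our own choices.

```python
import numpy as np

def match_columns(B_prime, des_o, tol=1e-8):
    """Match each column of the over-complete ICA mixing matrix B' to the
    observed variable whose set of observed descendants equals the column's
    support, then rescale the column so its diagonal entry is 1."""
    p_o = len(des_o)                      # number of observed variables
    p_r = B_prime.shape[1]                # number of recovered columns
    B_o = np.zeros((B_prime.shape[0], p_o))
    for i in range(p_r):
        # Support of column i: indices of its (numerically) nonzero entries.
        support = {k for k in range(B_prime.shape[0])
                   if abs(B_prime[k, i]) > tol}
        for j in range(p_o):
            if support == des_o[j]:
                B_o[:, j] = B_prime[:, i] / B_prime[j, i]
    return B_o

# Toy example (hypothetical): V1 -> V2, so des_o(V1) = {V1, V2}
# and des_o(V2) = {V2}, written 0-indexed below.
B_prime = np.array([[1.0, 0.0],
                    [0.9, 2.0]])
des_o = [{0, 1}, {1}]
B_hat = match_columns(B_prime, des_o)   # columns rescaled to unit diagonal
```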
Open Source Code No The paper does not contain an explicit statement about releasing source code or a link to a code repository for the methodology described.
Open Datasets Yes We considered the daily closing prices of the following world stock indices from 10/12/2012 to 10/12/2018, obtained from Yahoo financial database: Dow Jones Industrial Average (DJI) in USA, Nikkei 225 (N225) in Japan, Euronext 100 (N100) in Europe, Hang Seng Index (HSI) in Hong Kong, and the Shanghai Stock Exchange Composite Index (SSEC) in China.
Dataset Splits Yes First, for the causal graph in Figure 1, we generated 1000 samples of observed variables V1 and V2... In order to estimate the number of columns of B', we held out 250 of the samples for model selection.
Hardware Specification No The paper does not specify any particular hardware (e.g., GPU/CPU models, memory) used for running the experiments.
Software Dependencies No The paper mentions several algorithms such as RICA, lvLiNGAM, Direct-LiNGAM, and FCI, but it does not provide specific version numbers for any of these software dependencies or libraries.
Experiment Setup Yes First, for the causal graph in Figure 1, we generated 1000 samples of observed variables V1 and V2 where nonzero entries of matrix A are equal to 0.9. We utilized the Reconstruction ICA (RICA) algorithm (Le et al., 2011) to solve the over-complete ICA problem as follows: ... parameter λ controls the cost of the penalty term. We estimated matrix B' by UΣ^{1/2}Z where Z is the optimal solution of the above optimization problem. In order to estimate the number of columns of B', we held out 250 of the samples for model selection. More specifically, we solved the over-complete ICA problem for different numbers of columns, evaluated the fitness of each model by computing the objective function of RICA over the hold-out set, and selected the model with minimum cost. In order to check whether an entry is equal to zero, we used the bootstrapping method (Efron and Tibshirani, 1994), which generates 10 bootstrap samples by sampling with replacement from training data. For each bootstrap sample, we executed the RICA algorithm to obtain an estimation of B'. ... Afterwards, we used a t-test with confidence level of 95% to check whether an entry is equal to zero from the bootstrap samples.
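The bootstrapped zero-entry test quoted above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the estimator passed in is a toy stand-in (the paper re-runs RICA on each bootstrap sample), and the function name and seed are our own.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def bootstrap_zero_test(data, estimate_B, n_boot=10, alpha=0.05):
    """For each entry of the estimated matrix, test H0: entry == 0 with a
    one-sample t-test over bootstrap re-estimates (rows resampled with
    replacement), following the procedure described in the quoted setup."""
    n = data.shape[0]
    estimates = []
    for _ in range(n_boot):
        sample = data[rng.integers(0, n, size=n)]   # bootstrap resample
        estimates.append(estimate_B(sample))
    estimates = np.stack(estimates)                 # (n_boot, *entry_shape)
    _, p_values = stats.ttest_1samp(estimates, popmean=0.0, axis=0)
    return p_values < alpha                         # True where entry nonzero

# Toy stand-in estimator: per-column means (NOT the paper's RICA step).
# Column 0 is centred near 0; column 1 near 0.9, mimicking a nonzero entry.
data = rng.normal(loc=[0.0, 0.9], scale=0.1, size=(1000, 2))
nonzero = bootstrap_zero_test(data, lambda d: d.mean(axis=0))
```

Note that with only 10 bootstrap replicates the t-test has few degrees of freedom, so borderline entries near zero can go either way; clearly nonzero entries (like the 0.9 column here) are flagged reliably.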