Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Generalized Independent Noise Condition for Estimating Latent Variable Causal Graphs

Authors: Feng Xie, Ruichu Cai, Biwei Huang, Clark Glymour, Zhifeng Hao, Kun Zhang

NeurIPS 2020 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experimental results on synthetic and real-world data demonstrate the effectiveness of our method.
Researcher Affiliation	Academia	Feng Xie 1,2; Ruichu Cai 1,3; Biwei Huang 4; Clark Glymour 4; Zhifeng Hao 1,5; Kun Zhang 4. Affiliations: 1 School of Computer Science, Guangdong University of Technology, Guangzhou, China; 2 School of Mathematical Sciences, Peking University, Beijing, China; 3 Pazhou Lab, Guangzhou, China; 4 Department of Philosophy, Carnegie Mellon University, Pittsburgh, USA; 5 School of Mathematics and Big Data, Foshan University, Foshan, China
Pseudocode	Yes	Algorithm 1 Identifying Causal Clusters and Algorithm 2 Learning the Causal Order of Latent Variables are explicitly provided in the paper.
Open Source Code	Yes	Our source code is available from https://github.com/xiefeng009/GIN-Condition-for-Estimating-Latent-Variable-Causal-Graphs.
Open Datasets	Yes	Barbara Byrne conducted a study to investigate the impact of organizational (role ambiguity, role conflict, classroom climate, and superior support, etc.) and personality (self-esteem, external locus of control) on three facets of burnout in full-time elementary teachers [Byrne, 2010]. We applied our algorithm to this data set, with 28 observed variables in total.
Dataset Splits	No	The paper describes the generation of synthetic data with sample sizes (N = 500, 1000, 2000) and mentions a real-world dataset but does not specify explicit train/validation/test splits or percentages for either. It mentions 'Each experiment was repeated 10 times' for synthetic data, but this is a repetition count, not a data split.
Hardware Specification	No	The paper does not provide specific details about the hardware used to run the experiments, such as GPU models, CPU types, or memory specifications.
Software Dependencies	No	The paper mentions using 'Hilbert-Schmidt Independence Criterion (HSIC) test' and 'TETRAD package' for comparisons, but it does not provide specific version numbers for these or any other software dependencies.
Experiment Setup	Yes	In all four cases, the data are generated by LiNGLaM and the causal strength b is sampled from a uniform distribution between [-2, -0.5] ∪ [0.5, 2], noise terms are generated from uniform[-1,1] variables to the fifth power, and the sample size N = 500, 1000, 2000. Each experiment was repeated 10 times with randomly generated data and the results were averaged. In the implementation, the kernel width in the HSIC test is set to 0.05.
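
The sampling scheme quoted in the Experiment Setup row can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' code: it takes the coefficient range to be the two-sided interval [-2, -0.5] ∪ [0.5, 2], and the graph used here (a single latent variable with three observed indicators) is a hypothetical stand-in for the four structures evaluated in the paper. The function names and the `n_indicators` and `seed` parameters are likewise assumptions made for the example.

```python
import numpy as np


def sample_causal_strength(rng, size):
    """Draw coefficients uniformly from [-2, -0.5] U [0.5, 2]."""
    magnitude = rng.uniform(0.5, 2.0, size=size)
    sign = rng.choice([-1.0, 1.0], size=size)
    return sign * magnitude


def sample_noise(rng, size):
    """Draw uniform[-1, 1] variables and raise them to the fifth power."""
    return rng.uniform(-1.0, 1.0, size=size) ** 5


def generate_one_latent_dataset(n_samples, n_indicators=3, seed=0):
    """Hypothetical toy model: one latent variable with linear observed children."""
    rng = np.random.default_rng(seed)
    latent = sample_noise(rng, n_samples)                   # non-Gaussian latent
    loadings = sample_causal_strength(rng, n_indicators)    # causal strengths b
    noise = sample_noise(rng, (n_samples, n_indicators))    # independent noise terms
    observed = latent[:, None] * loadings[None, :] + noise
    return observed, loadings


# Mirror the reported protocol: three sample sizes, 10 repetitions each.
datasets = {
    n: [generate_one_latent_dataset(n, seed=rep)[0] for rep in range(10)]
    for n in (500, 1000, 2000)
}
```

Drawing the sign and the magnitude separately keeps coefficients away from zero, which is presumably why the range excludes (-0.5, 0.5). The HSIC test and its kernel width of 0.05 are not reproduced here.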