Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Faster Generic Identification in Tree-Shaped Structural Causal Models

Authors: Yasmine Briefs, Markus Bläser

NeurIPS 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We present an improved algorithm with running time O(n^3 log^2 n) and demonstrate its feasibility by providing an implementation that outperforms existing state-of-the-art implementations. [...] 5 Benchmarking [...] Figure 4: Running times on the 879 graphs benchmark
Researcher Affiliation | Academia | Yasmine Briefs, Max-Planck-Institute for Informatics, Saarland Informatics Campus, Saarbrücken, Germany, EMAIL; Markus Bläser, Saarland University, Saarland Informatics Campus, Saarbrücken, Germany, EMAIL
Pseudocode | Yes | Algorithm 1 Identification in tree-shaped SCMs. Input: A tree-shaped mixed graph M = (V, D, B). Output: For each λ_{p,i}, we output whether it is generically identifiable, 2-identifiable, or unidentifiable. In the first two cases, we output corresponding FASTPs.
Open Source Code | Yes | We provide our C++ implementation as the open-source R package fasttreeid, available on GitHub [Briefs and Bläser, 2025b] and CRAN [R Core Team, 2025, Briefs and Bläser, 2025a]. [...] The code is made available in the supplementary material and will be made public after reviewing. We also provide a Docker environment in which all programs can be executed.
Open Datasets | Yes | Van der Zander et al. [2022] provide 879 test cases with 8 nodes each. The directed edges form a line 0 → 1 → 2 → 3 → 4 → 5 → 6 → 7.
Dataset Splits | No | The paper mentions using '879 test cases' and generating random graphs, but it does not specify training/test/validation dataset splits in the context of model training or evaluation. The algorithms are tested on sets of graph instances rather than partitioned datasets for learning tasks.
Hardware Specification | Yes | All experiments were carried out on a Debian Linux server equipped with an AMD EPYC 7702 64-core processor running at 3.35 GHz and an overall memory of 2 TB.
Software Dependencies | Yes | We provide our C++ implementation as the open-source R package fasttreeid... We tested it against the treeID-algorithm, which is part of the DAGitty package [Textor et al., 2016]. We used the current version 3.1. ... SEMID [Barber et al., 2023], we used the current version 0.4.1. ... OpenSSL Project. Openssl: The open source toolkit for SSL/TLS. https://www.openssl.org, Jan 2024. Version 3.0.13. ... Torbjörn Granlund and the GMP development team. GNU MP: The GNU multiple precision arithmetic library. Release 6.3.0, 2023. URL https://gmplib.org/.
Experiment Setup | Yes | All experiments were carried out on a Debian Linux server equipped with an AMD EPYC 7702 64-core processor running at 3.35 GHz and an overall memory of 2 TB. [...] The overall running time of our algorithm does not depend too much on the actual graph structure, since we mainly use linear algebra operations. [...] All three programs were run on a single core with a time limit of 5 minutes per test case. treeID was run with 16 GiB of memory and did not terminate within the time limit for n >= 13. [...] treeID exceeds the memory limit of 256 GiB for n = 30 and n >= 33.
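
To make the input described in the Pseudocode and Open Datasets rows concrete, the following Python sketch represents a mixed graph M = (V, D, B) and builds the directed line on 8 nodes used in the 879-graph benchmark. It is only an illustration, not the authors' fasttreeid code: the names `MixedGraph`, `is_tree_shaped`, and `line_graph` are hypothetical, and the check assumes that "tree-shaped" means the directed edges form a tree rooted at node 0 in which every other node has exactly one parent.

```python
from dataclasses import dataclass, field


@dataclass
class MixedGraph:
    """Mixed graph M = (V, D, B) with nodes 0 .. n-1 (hypothetical helper)."""
    n: int
    directed: set = field(default_factory=set)    # D: (parent, child) pairs
    bidirected: set = field(default_factory=set)  # B: frozenset({u, v}) pairs

    def add_bidirected(self, u, v):
        self.bidirected.add(frozenset((u, v)))


def is_tree_shaped(g, root=0):
    """Assumed definition: the directed edges form a tree rooted at `root`."""
    parents = {}
    children = {v: [] for v in range(g.n)}
    for p, c in g.directed:
        if not (0 <= p < g.n and 0 <= c < g.n):
            return False  # edge endpoint outside the node set
        if c == root or c in parents:
            return False  # edge into the root, or a node with two parents
        parents[c] = p
        children[p].append(c)
    if len(parents) != g.n - 1:
        return False      # some non-root node has no parent
    # Every node must be reachable from the root (rules out detached cycles).
    seen, stack = {root}, [root]
    while stack:
        u = stack.pop()
        for c in children[u]:
            if c not in seen:
                seen.add(c)
                stack.append(c)
    return len(seen) == g.n


def line_graph(n=8):
    """Directed line 0 -> 1 -> ... -> n-1, with no bidirected edges yet."""
    g = MixedGraph(n)
    g.directed = {(i, i + 1) for i in range(n - 1)}
    return g
```

Bidirected edges do not affect the check, so a benchmark-style instance can be formed by calling `line_graph()` and then adding any set of bidirected edges with `add_bidirected`.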