Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Near-Optimal Experiment Design in Linear non-Gaussian Cyclic Models

Authors: Ehsan Sharifian, Saber Salehkaleybar, Negar Kiyavash

NeurIPS 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Our simulation results show that performing a small number of interventions guided by our stochastic optimization framework recovers the true underlying causal structure. Experiments show that our adaptive strategy outperforms other heuristic methods and closely matches the feedback vertex set (FVS) lower bound (see more details in Section ??).
Researcher Affiliation	Academia	Ehsan Sharifian EPFL, Lausanne, Switzerland EMAIL Saber Salehkaleybar Leiden University, Leiden, The Netherlands EMAIL Negar Kiyavash EPFL, Lausanne, Switzerland EMAIL
Pseudocode	No	The paper describes a greedy policy and refers to (E.2) for a 'fast greedy heuristic' in the appendix, but it does not contain a structured pseudocode or algorithm block in the main text.
Open Source Code	Yes	The codes are available as a supplementary material to faithfully reproduce the results. (NeurIPS Paper Checklist, Q5)
Open Datasets	No	Experiments are conducted on synthetic data generated from Erdős-Rényi random directed graphs, where each possible directed edge (excluding self-loops) is included independently with a fixed probability. The paper describes the generation process but does not provide concrete access information (link, DOI, or repository) for the generated synthetic data itself.
Dataset Splits	No	Experiments are conducted on synthetic data generated from Erdős-Rényi random directed graphs. The paper does not mention any explicit training, validation, or test dataset splits, as it primarily focuses on simulations with generated data.
Hardware Specification	Yes	Compute resources are reported in the appendix. (NeurIPS Paper Checklist, Q8)
Software Dependencies	No	The main text mentions applying the 'Fast ICA algorithm [15]', but it does not specify version numbers for this or any other software libraries or dependencies used in the experiments.
Experiment Setup	Yes	Experiments are conducted on synthetic data generated from Erdős-Rényi random directed graphs, where each possible directed edge (excluding self-loops) is included independently with a fixed probability. The adaptive method uses a bipartite representation of the graph and samples perfect matchings using two modes: exact, where all matchings are enumerated, and sample, where a fast greedy heuristic is used (E.2). We compare our adaptive strategy against two baselines: Random, which selects a target uniformly at random, and Max Degree, which chooses the node with the highest degree in the bipartite graph. To assess the practical viability of our method, we evaluate its performance in a more realistic setting without an ideal ICA oracle. Instead, we apply the Fast ICA algorithm [15] to finite samples generated from both observational and interventional distributions. We introduce two key algorithmic modifications for robustness: adaptive thresholding of matrix entries and a safe matching procedure to prevent incorrect row assignments. A detailed description of these modifications is provided in Appendix E.3. (Section 8 and 8.2)
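
The graph generation described in the Experiment Setup row can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the adjacency-matrix representation, and the seed handling are assumptions made for illustration. It shows only the stated sampling rule (each directed edge other than self-loops is included independently with a fixed probability) and the Random baseline; the bipartite representation and the Max Degree baseline are not specified in enough detail here to reproduce.

```python
import random


def erdos_renyi_digraph(n, p, seed=None):
    """Sample a random directed graph on n nodes (hypothetical helper).

    Returns an n x n 0/1 adjacency matrix where adj[i][j] == 1 means
    there is an edge i -> j. Each ordered pair (i, j) with i != j is
    included independently with probability p, so cycles are possible.
    """
    rng = random.Random(seed)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and rng.random() < p:  # self-loops are excluded
                adj[i][j] = 1
    return adj


def random_baseline_target(candidates, rng):
    """Random baseline: pick one intervention target uniformly at random."""
    return rng.choice(sorted(candidates))


if __name__ == "__main__":
    graph = erdos_renyi_digraph(n=5, p=0.3, seed=0)
    for row in graph:
        print(row)
    print(random_baseline_target({0, 1, 2, 3, 4}, random.Random(0)))
```

Fixing the seed makes a sampled graph repeatable, which is one way the synthetic data could be regenerated even though no dataset link is provided.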