Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Reinforcement Learning for Causal Discovery without Acyclicity Constraints

Authors: Bao Duong, Hung Le, Biwei Huang, Thin Nguyen

TMLR 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In addition, we provide compelling empirical evidence for the strong performance of ALIAS in comparison with state-of-the-arts in causal discovery over increasingly difficult experiment conditions on both synthetic and real datasets. Our implementation is provided at https://github.com/baosws/ALIAS. 6 Numerical Evaluations In this section, we validate our method in the causal discovery task across a comprehensive set of settings, including different nonlinearities, varying graph types, sizes and densities, varying sample sizes, as well as different degrees of model misspecification on both synthetic and real data.
Researcher Affiliation	Academia	Bao Duong EMAIL Applied Artificial Intelligence Institute (A2I2), Deakin University Hung Le EMAIL Applied Artificial Intelligence Institute (A2I2), Deakin University Biwei Huang EMAIL University of California, San Diego Thin Nguyen EMAIL Applied Artificial Intelligence Institute (A2I2), Deakin University
Pseudocode	Yes	Algorithm 1 ALIAS with vanilla policy gradient for causal discovery.
Open Source Code	Yes	Our implementation is provided at https://github.com/baosws/ALIAS.
Open Datasets	Yes	We conduct extensive empirical evaluations on both simulated and real datasets, where the ground truth DAGs are available, to compare the efficiency of the proposed ALIAS method with up-to-date state-of-the-arts in causal discovery... Next, to confirm the validity of our method past synthetic data, we evaluate it on the popular benchmark flow cytometry dataset (Sachs et al., 2005)... We evaluate the performance of the proposed method ALIAS with competitors on the exact 5 datasets used by Zhu et al. (2020) in their experiment, which are produced by Lachapelle et al. (2020) (https://github.com/kurowasan/GraN-DAG, MIT license).
Dataset Splits	No	For a given number of nodes d, we first generate a DAG following the Erdős-Rényi graph model (Erdős & Rényi, 1960)... We then sample n = 1 000 observations for each dataset. This causal model is identifiable due to the equal noise variances (Peters et al., 2014)... We again consider the difficult configuration of ER-8 graphs, and vary the sample size from very limited (100) to redundant (5 000) in Figure 3... We employ the observational partition of the dataset with 853 samples, 11 nodes, and 17 edges. The paper describes generating datasets with specific sample sizes and using an
Hardware Specification	Yes	Experiments are executed on a mix of several machines running Ubuntu 20.04/22.04 with the matching Python environments, including the following configurations: AMD EPYC 7742 CPU, 1TB of RAM, and 8 Nvidia A100 40GB GPUs. Intel Xeon Platinum 8452Y CPU, 1TB of RAM, 4 Nvidia H100 80GB GPUs. Intel Core i9 13900KF CPU, 128GB of RAM, 1 Nvidia 4070Ti Super 16GB GPU.
Software Dependencies	No	Our proposed ALIAS method is implemented using the Stable-Baselines3 toolset (Raffin et al., 2021) with the Advantage Actor-Critic (A2C, Mnih et al., 2016) and Proximal Policy Optimization (PPO, Schulman et al., 2017) methods, and a custom DAG environment built on top of Gymnasium (Towers et al., 2023).
Experiment Setup	Yes	C.2.3 Hyper-parameters We provide the hyper-parameters for each method and experiment scenario as follows: ALIAS (Ours): Table 7. (and subsequent tables for other methods).
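
The synthetic-data procedure quoted in the Dataset Splits row (generate an Erdős-Rényi DAG, then sample n = 1 000 observations with equal noise variances) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the edge probability, the weight range, and the linear-Gaussian mechanism are all assumptions chosen for illustration, and the paper also evaluates nonlinear settings.

```python
import random


def sample_er_dag(d, edge_prob, rng):
    """Sample a DAG by orienting Erdős-Rényi edges along a random node ordering."""
    order = list(range(d))
    rng.shuffle(order)
    adj = [[0] * d for _ in range(d)]
    for a in range(d):
        for b in range(a + 1, d):
            if rng.random() < edge_prob:
                # Edges only point from earlier to later nodes in `order`,
                # so the resulting graph cannot contain a cycle.
                adj[order[a]][order[b]] = 1
    return adj, order


def sample_linear_gaussian(adj, order, n, rng, noise_std=1.0):
    """Draw n observations from a linear SEM with the same noise std for every node.

    The linear mechanism and the weight range are illustrative assumptions.
    """
    d = len(adj)
    weights = [
        [rng.choice([-1, 1]) * rng.uniform(0.5, 2.0) if adj[i][j] else 0.0
         for j in range(d)]
        for i in range(d)
    ]
    data = []
    for _ in range(n):
        x = [0.0] * d
        for j in order:  # parents precede children in `order`
            parents_sum = sum(weights[i][j] * x[i] for i in range(d))
            x[j] = parents_sum + rng.gauss(0.0, noise_std)
        data.append(x)
    return data


def is_acyclic(adj):
    """Check acyclicity with Kahn's algorithm (repeatedly remove zero in-degree nodes)."""
    d = len(adj)
    indeg = [sum(adj[i][j] for i in range(d)) for j in range(d)]
    queue = [j for j in range(d) if indeg[j] == 0]
    seen = 0
    while queue:
        i = queue.pop()
        seen += 1
        for j in range(d):
            if adj[i][j]:
                indeg[j] -= 1
                if indeg[j] == 0:
                    queue.append(j)
    return seen == d


rng = random.Random(0)
adj, order = sample_er_dag(d=10, edge_prob=0.3, rng=rng)
data = sample_linear_gaussian(adj, order, n=1000, rng=rng)
```

Because every dataset here is generated in full and then used for structure learning, there is no train/validation/test partition to report, which is consistent with the "No" result for Dataset Splits.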