Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Less Greedy Equivalence Search

Authors: Adiba Ejaz, Elias Bareinboim

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experiments demonstrate that LGES outperforms GES and other baselines in speed, accuracy, and robustness to misspecified knowledge. Our code is available at https://github.com/CausalAILab/lges.
Researcher Affiliation	Academia	Adiba Ejaz Elias Bareinboim Causal Artificial Intelligence Lab Columbia University EMAIL
Pseudocode	Yes	Algorithm 1: Less Greedy Equivalence Search (LGES). Input: data D ∼ P(v), scoring criterion S, prior assumptions S = ⟨R, F⟩, initial MEC E0, insertion strategy GetInsert ∈ {GETSAFEINSERT, GETCONSERVATIVEINSERT}. Output: MEC E of P(v).
Open Source Code	Yes	Our code is available at https://github.com/CausalAILab/lges.
Open Datasets	Yes	Sachs dataset. We compare GES and LGES on a real-world protein signaling dataset [48]. The observational dataset consists of 853 measurements of 11 phospholipids and phosphoproteins... from the bnlearn repository.
Dataset Splits	No	The paper describes generating synthetic data with specific parameters (e.g., 'sample 50 graphs and generate linear-Gaussian data for each graph', 'We obtain samples of size n ∈ {10^3, 10^4} via sempler [20]'). For the real-world Sachs dataset, no train/test/validation splits are mentioned. The paper describes the data generation process and total sample sizes, but not how the data might be split for training and testing.
Hardware Specification	Yes	Compute details. All experiments were run on a shared compute cluster with 2x Intel Xeon Platinum 8480+ CPUs (112 cores total, 224 threads) at up to 3.8 GHz, and 210 MiB L3 cache.
Software Dependencies	No	All implementations in Python. GES variants are implemented by modifying the code in https://github.com/juangamella/ges. We use the PC implementation in causal-learn [62] and NoTears implementation in causal-nex [5]. The paper mentions software names but does not provide specific version numbers for Python or the libraries used.
Experiment Setup	Yes	We draw Erdős-Rényi graphs with p variables and {1, 2, 3} · p edges in expectation (denoted ER-{1, 2, 3} respectively). We run most experiments for p up to 150... For each p, we sample 50 graphs and generate linear-Gaussian data for each graph. Following [37], we draw weights from U([-2, -0.5] ∪ [0.5, 2]) and noise variances from U([0.1, 0.5]). We obtain samples of size n ∈ {10^3, 10^4} via sempler [20]. We ran the PC algorithm using significance level α = 0.05 for conditional independence tests, with the null hypothesis of independence. We ran NoTears with default parameters from the causalnex library, which uses a weight threshold of w = 0.
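
The synthetic-data setup quoted in the Experiment Setup row can be approximated with a short NumPy sketch. This is not the authors' code and does not use sempler; the function names (sample_er_dag, sample_weights, sample_linear_gaussian) and the seed are hypothetical, chosen only for illustration. It assumes an ER-k graph means each of the p(p-1)/2 node pairs gets an edge independently, with the probability set so that k · p edges appear in expectation.

```python
import numpy as np


def sample_er_dag(p, edge_factor, rng):
    """Sample a random DAG with edge_factor * p edges in expectation (assumed ER-k reading)."""
    # p(p-1)/2 candidate pairs, each kept with probability 2k/(p-1), gives k*p expected edges.
    prob = min(1.0, 2.0 * edge_factor / (p - 1))
    upper = np.triu(rng.random((p, p)) < prob, k=1)  # strictly upper triangular, hence acyclic
    perm = rng.permutation(p)  # hide the causal order by relabeling nodes
    return upper[np.ix_(perm, perm)]


def sample_weights(adj, rng):
    """Draw edge weights from U([-2, -0.5] U [0.5, 2]) on the edges of adj."""
    magnitude = rng.uniform(0.5, 2.0, size=adj.shape)
    sign = rng.choice([-1.0, 1.0], size=adj.shape)
    return adj * magnitude * sign  # zero wherever there is no edge


def sample_linear_gaussian(weights, n, rng):
    """Generate n samples from the linear-Gaussian SEM x = x W + noise."""
    p = weights.shape[0]
    variances = rng.uniform(0.1, 0.5, size=p)  # one noise variance per variable
    noise = rng.normal(0.0, np.sqrt(variances), size=(n, p))
    # Solving x = x W + noise for x gives x = noise (I - W)^{-1}.
    return noise @ np.linalg.inv(np.eye(p) - weights)


rng = np.random.default_rng(0)  # arbitrary seed, not from the paper
adj = sample_er_dag(p=20, edge_factor=2, rng=rng)  # an ER-2 graph on 20 variables
weights = sample_weights(adj, rng)
data = sample_linear_gaussian(weights, n=1000, rng=rng)  # n = 10^3
```

Here weights[i, j] is the coefficient of the edge from variable i to variable j, so each column of data is a weighted sum of its parents plus independent Gaussian noise. The resulting array could then be passed to a structure learning routine such as GES or PC.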