Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.

Less Greedy Equivalence Search

Authors: Adiba Ejaz, Elias Bareinboim

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental Experiments demonstrate that LGES outperforms GES and other baselines in speed, accuracy, and robustness to misspecified knowledge. Our code is available at https://github.com/CausalAILab/lges.
Researcher Affiliation Academia Adiba Ejaz, Elias Bareinboim; Causal Artificial Intelligence Lab, Columbia University; EMAIL
Pseudocode Yes Algorithm 1: Less Greedy Equivalence Search (LGES). Input: data D ∼ P(v), scoring criterion S, prior assumptions S = ⟨R, F⟩, initial MEC E0, insertion strategy GetInsert in {GETSAFEINSERT, GETCONSERVATIVEINSERT}. Output: MEC E of P(v).
Open Source Code Yes Our code is available at https://github.com/CausalAILab/lges.
Open Datasets Yes Sachs dataset. We compare GES and LGES on a real-world protein signaling dataset [48]. The observational dataset consists of 853 measurements of 11 phospholipids and phosphoproteins... from the bnlearn repository.
Dataset Splits No The paper describes generating synthetic data with specific parameters (e.g., 'sample 50 graphs and generate linear-Gaussian data for each graph', 'We obtain samples of size n ∈ {10^3, 10^4} via sempler [20]'). For the real-world Sachs dataset, no train/validation/test splits are mentioned. The paper thus reports the data generation process and total sample sizes, but not how the data might be split for training and testing.
Hardware Specification Yes Compute details. All experiments were run on a shared compute cluster with 2x Intel Xeon Platinum 8480+ CPUs (112 cores total, 224 threads) at up to 3.8 GHz, and 210 MiB L3 cache.
Software Dependencies No All implementations in Python. GES variants are implemented by modifying the code in https://github.com/juangamella/ges. We use the PC implementation in causal-learn [62] and NoTears implementation in causal-nex [5]. The paper names the software it uses but does not give version numbers for Python or the libraries.
Experiment Setup Yes We draw Erdős-Rényi graphs with p variables and {1, 2, 3} · p edges in expectation (denoted ER-{1, 2, 3} respectively). We run most experiments for p up to 150... For each p, we sample 50 graphs and generate linear-Gaussian data for each graph. Following [37], we draw weights from U([−2, −0.5] ∪ [0.5, 2]) and noise variances from U([0.1, 0.5]). We obtain samples of size n ∈ {10^3, 10^4} via sempler [20]. We ran the PC algorithm using significance level α = 0.05 for conditional independence tests, with the null hypothesis of independence. We ran NoTears with default parameters from the causalnex library, which uses a weight threshold of w = 0.
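
The synthetic-data recipe in the Experiment Setup row can be illustrated with a minimal sketch. This is not the authors' code: the paper generates samples with sempler, whereas the sketch below uses plain NumPy, and the function names (sample_er_dag, sample_weights, sample_linear_gaussian) are hypothetical. It assumes variables are indexed in topological order, so the weight matrix is strictly upper triangular.

```python
import numpy as np

def sample_er_dag(p, k, rng):
    """Boolean adjacency mask of a random DAG with about k * p edges in expectation."""
    # There are p * (p - 1) / 2 candidate edges i -> j with i < j; including each
    # with probability q = 2k / (p - 1) gives k * p edges in expectation.
    q = min(1.0, 2.0 * k / (p - 1))
    return np.triu(rng.random((p, p)) < q, k=1)

def sample_weights(mask, rng):
    """Edge weights drawn from U([-2, -0.5] ∪ [0.5, 2]); zero where there is no edge."""
    magnitude = rng.uniform(0.5, 2.0, size=mask.shape)
    sign = rng.choice([-1.0, 1.0], size=mask.shape)
    return np.where(mask, sign * magnitude, 0.0)

def sample_linear_gaussian(W, n, rng):
    """Draw n samples from the linear-Gaussian model X = X W + N."""
    p = W.shape[0]
    variances = rng.uniform(0.1, 0.5, size=p)  # noise variances from U([0.1, 0.5])
    noise = rng.normal(0.0, np.sqrt(variances), size=(n, p))
    # Solving X = X W + N for X gives X = N (I - W)^{-1}; W[i, j] weights edge i -> j.
    return noise @ np.linalg.inv(np.eye(p) - W)

rng = np.random.default_rng(0)
mask = sample_er_dag(p=20, k=2, rng=rng)        # an ER-2 graph on 20 variables
W = sample_weights(mask, rng)
X = sample_linear_gaussian(W, n=10**3, rng=rng)  # data matrix of shape (1000, 20)
```

Because the mask is strictly upper triangular, I - W is always invertible. In practice one would typically also permute the variable order so that column index does not reveal the causal ordering.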