Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Vector Causal Inference between Two Groups of Variables
Authors: Jonas Wahl, Urmi Ninad, Jakob Runge
AAAI 2023 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our methods empirically and compare them to other state-of-the-art techniques. In Section 5, we analyse the empirical performance of these algorithms in experiments with synthetic data and compare it to that of other approaches (Vanilla-PC and the Trace Method (Janzing et al. 2009; Zscheischler, Janzing, and Zhang 2012)). We also consider a real world climate science example of surface temperatures in the El Niño Southern Oscillation (ENSO 3.4) region in the pacific and in British Columbia to test our algorithms. |
| Researcher Affiliation | Academia | Jonas Wahl*,1,2, Urmi Ninad*,1,2, Jakob Runge1,2 1Technische Universität Berlin 2 DLR Institut für Datenwissenschaften Jena |
| Pseudocode | Yes | Algorithm 1: 2G-VecCI.PC |
| Open Source Code | Yes | All code is available at https://github.com/JonasChoice/2GVecCI. |
| Open Datasets | Yes | NCEP-NCAR Reanalysis 1 data was provided by NOAA PSL, Boulder, Colorado, USA, from their website at https://psl.noaa.gov, see Kalnay et al. (1996). |
| Dataset Splits | No | The paper describes the generation of simulated data and the use of real-world data with varying sample sizes, but it does not specify explicit train/validation/test dataset splits, nor does it mention cross-validation or other detailed splitting methodologies for reproducibility. |
| Hardware Specification | Yes | Computations were done on Bull Sequana XH2000 with AMD 7763 CPUs. |
| Software Dependencies | No | The paper mentions using specific tests and algorithms (e.g., 'partial correlation test', 'Gaussian Process distance correlation independence test', 'PC-algorithm') but does not specify version numbers for any software libraries, frameworks, or dependencies used. |
| Experiment Setup | Yes | Models vary along the following parameters: sample size (between 50 and 500); group sizes n and m (between 3 and 100); edge densities within X and ηY (between 1% and 90% of all possible edges); density of the interaction matrix A (between 1% and 90% of all possible entries non-zero); effect size, i.e. size of the entries in A (uniformly randomly drawn from different intervals). In both algorithms, we test for conditional independencies using the partial correlation test at significance level α = 0.01. We choose the sensitivity parameter to be α = 0.01 if not specified differently. |
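
The Experiment Setup row mentions a partial correlation test at significance level α = 0.01. As a rough illustration of what such a test does, the sketch below applies a Fisher z approximation to a toy chain x → z → y. It is not the authors' implementation (their code is in the linked repository). The function names, the toy coefficients, the random seed, and the restriction to a single conditioning variable are all assumptions made for this example.

```python
import math
import random
import statistics

ALPHA = 0.01  # significance level quoted in the Experiment Setup row


def pearson(a, b):
    """Sample Pearson correlation of two equally long sequences."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    var_a = sum((u - ma) ** 2 for u in a)
    var_b = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def partial_corr(x, y, z):
    """First-order partial correlation of x and y given one variable z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def fisher_z_pvalue(r, n, k):
    """Two-sided p-value for H0: the (partial) correlation is zero.

    n is the sample size, k the number of conditioning variables.
    Uses the normal approximation of the Fisher z-transformed statistic.
    """
    r = max(min(r, 0.999999), -0.999999)  # keep atanh finite
    z = math.atanh(r) * math.sqrt(n - k - 3)
    return math.erfc(abs(z) / math.sqrt(2))


# Toy data (hypothetical, not the paper's simulation design): chain x -> z -> y.
random.seed(0)
n = 500  # upper end of the sample-size range quoted in the row
x = [random.gauss(0, 1) for _ in range(n)]
z = [0.8 * xi + random.gauss(0, 1) for xi in x]
y = [0.8 * zi + random.gauss(0, 1) for zi in z]

r_marginal = pearson(x, y)
r_partial = partial_corr(x, y, z)
p_marginal = fisher_z_pvalue(r_marginal, n, 0)
p_conditional = fisher_z_pvalue(r_partial, n, 1)

print(f"marginal:    r = {r_marginal:.3f}, reject independence: {p_marginal < ALPHA}")
print(f"given z:     r = {r_partial:.3f}, reject independence: {p_conditional < ALPHA}")
```

In this toy chain, x and y are correlated marginally, while conditioning on z should remove most of that correlation. A constraint-based algorithm such as PC uses decisions of this kind to delete edges. Tests with larger conditioning sets would need regression residuals or a precision matrix instead of the first-order formula used here.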