Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Identifying Causal Mechanism Shifts Under Additive Models with Arbitrary Noise

Authors: Yewei Xia, Xueliang Cui, Hao Zhang, Yixin Ren, Feng Xie, Jihong Guan, Ruxin Wang, Shuigeng Zhou

IJCAI 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Evaluated on various synthetic datasets, CMSI consistently outperforms existing baselines in terms of F1 score. Additionally, we demonstrate CMSI's applicability on gene expression datasets of ovarian cancer patients at different disease stages. We evaluate the performance and applicability of our method by extensive experiments on both synthetic and ovarian cancer datasets.
Researcher Affiliation	Academia	1Shanghai Key Lab of Intelligent Information Processing, and School of Computer Science, Fudan University, Shanghai, China 2SIAT, Chinese Academy of Sciences, Shenzhen, China 3Southern University of Science and Technology, Shenzhen, China 4Department of Applied Statistics, Beijing Technology and Business University, Beijing, China 5Department of Computer Science and Technology, Tongji University, Shanghai, China
Pseudocode	Yes	Algorithm 1 Regression the Score on Residual (ReSR) Input: Dataset X. Output: Estimator ĝ(R) of the score s(X). Algorithm 2 Causal Mechanism Shifts Identification (CMSI) Input: Dataset X1, ..., XH. Output: Estimated shifted variables set Î.
Open Source Code	No	The paper does not provide concrete access to source code for the methodology described. It mentions using NVIDIA's cuML library and the Python library kneed, but does not state that the authors' own implementation code is released.
Open Datasets	Yes	We evaluated CMSI on an ovarian cancer dataset [Tothill et al., 2008] that was previously analyzed by iSCAN [Chen et al., 2024b] and DCI [Wang et al., 2018].
Dataset Splits	Yes	The default sample size in each environment equals 500. ...For example, consider a scenario with H = 3 environments, each containing d = 20 nodes. ...We evaluated CMSI on an ovarian cancer dataset [Tothill et al., 2008] ...divided into two subsets based on survival duration.
Hardware Specification	Yes	Experiments were conducted on a system equipped with an Intel Xeon(R) Platinum 8255C CPU and two NVIDIA GeForce RTX 2080 Ti GPUs.
Software Dependencies	No	The paper mentions using "NVIDIA's cuML library" and "the Python library kneed" but does not specify version numbers for these software components.
Experiment Setup	Yes	The regularization coefficient in the kernel ridge regression is set to α = 0.1, and we use the radial basis function (RBF) kernel with a width parameter of γ = 0.1. ...Based on the generated causal graph and the additive noise model (noise ∈ {Gaussian(0, 1), Laplace(0, 1), Gumbel(0, 1), Exponential(1), Beta(1, 1), Gamma(0.5, 0.5)})...
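
To make the quoted experiment setup concrete, the following is a minimal sketch of a kernel ridge regression with an RBF kernel using the reported values α = 0.1 and γ = 0.1, fitted on one synthetic environment of 500 samples. It is not the authors' implementation (the paper reports using cuML, and no code is released). The kernel convention exp(-γ‖x − y‖²), the (K + αI) solve, the sin mechanism, the function names, and the reading of Gamma(0.5, 0.5) as shape and scale are all assumptions made for illustration.

```python
import numpy as np

ALPHA = 0.1      # regularization coefficient quoted from the paper
GAMMA = 0.1      # RBF width parameter quoted from the paper
N_SAMPLES = 500  # default sample size per environment quoted from the paper

# Noise families named in the setup. The mapping of the quoted parameters
# onto NumPy arguments is an assumption (e.g. Gamma as shape=0.5, scale=0.5).
NOISE_SAMPLERS = {
    "gaussian": lambda rng, n: rng.normal(0.0, 1.0, n),
    "laplace": lambda rng, n: rng.laplace(0.0, 1.0, n),
    "gumbel": lambda rng, n: rng.gumbel(0.0, 1.0, n),
    "exponential": lambda rng, n: rng.exponential(1.0, n),
    "beta": lambda rng, n: rng.beta(1.0, 1.0, n),
    "gamma": lambda rng, n: rng.gamma(0.5, 0.5, n),
}


def rbf_kernel(A, B, gamma=GAMMA):
    """Assumed convention: k(a, b) = exp(-gamma * ||a - b||^2)."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))


def fit_krr(X, y, alpha=ALPHA, gamma=GAMMA):
    """Return dual coefficients c solving (K + alpha * I) c = y."""
    K = rbf_kernel(X, X, gamma)
    return np.linalg.solve(K + alpha * np.eye(len(X)), y)


def predict_krr(X_train, coef, X_new, gamma=GAMMA):
    """Evaluate the fitted regressor at X_new."""
    return rbf_kernel(X_new, X_train, gamma) @ coef


def simulate(noise="laplace", n=N_SAMPLES, seed=0):
    """Hypothetical one-parent additive model: child = sin(parent) + noise."""
    rng = np.random.default_rng(seed)
    parent = rng.normal(0.0, 1.0, (n, 1))
    child = np.sin(parent[:, 0]) + NOISE_SAMPLERS[noise](rng, n)
    return parent, child


if __name__ == "__main__":
    X, y = simulate("laplace")
    coef = fit_krr(X, y)
    residual = y - predict_krr(X, coef, X)
    print("residual variance:", residual.var())
```

The residuals of such a regression are the kind of quantity that a method named "Regression the Score on Residual" would plausibly operate on, but how the paper uses them is not stated in the quoted text, so the sketch stops at the regression step.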