Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Practical Kernel Selection for Kernel-based Conditional Independence Test

Authors: Wenjie Wang, Mingming Gong, Biwei Huang, James Bailey, Bo Han, Kun Zhang, Feng Liu

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Furthermore, we conduct extensive experiments on both synthetic and real-world datasets to empirically validate the effectiveness of our method.
Researcher Affiliation	Academia	1 The University of Melbourne; 2 University of California, San Diego; 3 Hong Kong Baptist University; 4 Carnegie Mellon University; 5 Mohamed bin Zayed University of Artificial Intelligence
Pseudocode	Yes	Algorithm 1: Overall Procedure of Power
Open Source Code	Yes	The code is available at: https://github.com/wenjiewang3/Power KCI
Open Datasets	Yes	extensive experiments on both synthetic and real-world datasets to empirically validate the effectiveness of our method. ...experiments on a real-world conditional independence benchmark (car insurance dataset) in Appendix F.1.1... ...Additionally, we conducted experiments on the real-world causal discovery benchmarks SACHs [Sachs et al., 2005] and CHILD [Spiegelhalter et al., 1993].
Dataset Splits	Yes	We evenly divide all samples into a training set and a testing set. ...Following the default setting in Polo et al. [2023], the dataset is split 70/30% for training and testing. ...For each cases involves 10 variables with sample sizes of n = 500, which are evenly divided into training data and testing data.
Hardware Specification	Yes	All experiments were conducted on an Intel 14700K CPU platform with 32GB of RAM, without GPU acceleration.
Software Dependencies	No	The paper mentions software components like 'joblib package', 'XGBoost', and 'Adam optimizer', but no specific version numbers are provided for these or other key software components used in the experiments.
Experiment Setup	Yes	For the kernel parameters of ϕx and ϕy, we use the median heuristic as the initial value and apply different weights. Specifically, we take the median heuristic as a sensible initialization and use the candidate weight list [0.1, 0.3, 0.75, 0.88, 1, 1.25, 1.5, 3, 5, 10], applying each weight as a multiplier to the median-based bandwidth. ...The significance level is set to the default value of 0.05. ...The amplitude A is limited to the range of [10^-3, 10^3]. The bandwidth σr z is a vector whose dimensions are the same as those of the conditioning variable Z, with values constrained to [10^-2, 10^2]. The regularization parameter ε is constrained to [10^-10, 1]. We use marginal likelihood as the loss function and the L-BFGS-B algorithm [Liu and Nocedal, 1989] to optimize and update these parameters.
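
Illustrative sketch (not the authors' code): the Experiment Setup row describes scaling a median-heuristic bandwidth by a list of candidate weights, and the Dataset Splits row describes dividing samples evenly into training and testing sets. The snippet below shows one way these two steps might look. The function names are hypothetical. The median heuristic is assumed here to be the median of pairwise Euclidean distances, which is one common convention and may differ from the paper's exact variant. The random shuffle before splitting is also an assumption.

```python
import numpy as np

# Candidate multipliers as quoted in the Experiment Setup row.
WEIGHTS = [0.1, 0.3, 0.75, 0.88, 1, 1.25, 1.5, 3, 5, 10]


def median_heuristic(x):
    """Median pairwise Euclidean distance between samples (assumed convention)."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]  # treat a 1-D input as n samples of dimension 1
    diffs = x[:, None, :] - x[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    upper = np.triu_indices(len(x), k=1)  # each unordered pair once, no diagonal
    return float(np.median(dists[upper]))


def candidate_bandwidths(x, weights=WEIGHTS):
    """Multiply the median-based bandwidth by each candidate weight."""
    base = median_heuristic(x)
    return [w * base for w in weights]


def even_split(x, seed=0):
    """Shuffle (an assumption) and divide the samples into two equal halves."""
    x = np.asarray(x)
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(x))
    half = len(x) // 2
    return x[idx[:half]], x[idx[half:]]
```

With n = 500 samples, as in the quoted causal discovery setting, `even_split` yields 250 training and 250 testing samples. How the paper then chooses among the candidate bandwidths is not covered by this sketch.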