Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Differentiable Cyclic Causal Discovery Under Unmeasured Confounders

Authors: Muralikrishnna Guruswamy Sethuraman, Faramarz Fekri

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Through experiments on synthetic data and real-world gene perturbation datasets, we show that DCCD-CONF outperforms state-of-the-art methods in both causal graph recovery and confounder identification.
Researcher Affiliation	Academia	Muralikrishnna G. Sethuraman, School of Electrical & Computer Engineering, Georgia Institute of Technology; Faramarz Fekri, School of Electrical & Computer Engineering, Georgia Institute of Technology
Pseudocode	Yes	The overall parameter update procedure is summarized in Algorithm 1 in Appendix B.
Open Source Code	Yes	The code for DCCD-CONF is available at the repository: https://github.com/muralikgs/dccd_conf.
Open Datasets	Yes	Through experiments on synthetic data and real-world gene perturbation datasets... Specifically, we use the Perturb-CITE-seq dataset [55], which contains gene expression data from 218,331 melanoma cells across three conditions... We further evaluate DCCD-CONF on a biological dataset for protein signaling network discovery [1], which is widely used as a benchmark for causal discovery algorithms.
Dataset Splits	Yes	Our training data set consists of observational data and single-node interventional data over all the nodes in the graph, i.e., I = ∅, {1}, ..., {d} (unless stated otherwise), with N_k = 500 samples per intervention. To evaluate performance, we split each dataset 90-10, using the smaller portion as the test set, and measure performance using negative log-likelihood (NLL) on the test data after model training (lower is better).
Hardware Specification	Yes	The models were trained and evaluated on NVIDIA RTX6000 GPUs.
Software Dependencies	No	We implemented our framework using the libraries PyTorch and scikit-learn in Python, and the code used in running the experiments can be found in the following GitHub repository: https://github.com/muralikgs/dccd_conf. The final objective is optimized using the Adam optimizer [64]. For FCI, we used the implementation that is available in the causal-learn Python library (https://github.com/py-why/causal-learn).
Experiment Setup	Yes	The learning rate in all our experiments was set to 10^-2. The neural network models used in our experiments contained one multi-layer perceptron layer. No nonlinearities were added to the neural networks for the linear SEM experiments. We used tanh activation for the nonlinear SEM experiments and for the experiments on the Perturb-CITE-seq dataset. The graph sparsity regularization constant λ was set to 10^-2 for all the experiments. The sparsity-inducing regularization constant for the inverse covariance matrix of the confounder distribution, ρ, was set to 10^-1 in all the experiments. The final objective is optimized using the Adam optimizer [64].
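
The reported setup and 90-10 split can be collected into a short sketch for anyone attempting a rerun. This is not code from the dccd_conf repository: the dictionary keys, the helper name split_90_10, and the fixed seed are hypothetical. The numeric values assume the exponents in the Experiment Setup row are negative (10^-2 and 10^-1), and the sketch assumes the split is applied per intervention regime, which the excerpt does not state.

```python
import random

# Hyperparameters as reported in the table above.
# Key names are illustrative only; exponent signs are assumed negative.
CONFIG = {
    "optimizer": "adam",
    "learning_rate": 1e-2,
    "graph_sparsity_lambda": 1e-2,
    "confounder_precision_rho": 1e-1,
    "samples_per_intervention": 500,
    "test_fraction": 0.1,
}


def split_90_10(samples, test_fraction=0.1, seed=0):
    """Shuffle sample indices and return (train, test); test is the smaller part."""
    rng = random.Random(seed)  # fixed seed is an assumption, for repeatability
    indices = list(range(len(samples)))
    rng.shuffle(indices)
    n_test = int(round(len(samples) * test_fraction))
    test_idx = set(indices[:n_test])
    train = [s for i, s in enumerate(samples) if i not in test_idx]
    test = [s for i, s in enumerate(samples) if i in test_idx]
    return train, test


# Stand-in for one regime (observational or a single-node intervention).
regime_samples = list(range(CONFIG["samples_per_intervention"]))
train, test = split_90_10(regime_samples, CONFIG["test_fraction"])
print(len(train), len(test))
```

With 500 samples per regime, this yields 450 training and 50 test samples. The test NLL would then be computed on the held-out portion after training.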