Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Root Cause Analysis of Outliers with Missing Structural Knowledge

Authors: William Roy Orchard, Nastaran Okati, Sergio Garrido Mejia, Patrick Blöbaum, Dominik Janzing

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Finally, in Section 4 we compare existing approaches to RCA to our own proposals on both synthetic and real-world data, demonstrating that despite their simplicity, our approaches are competitive.
Researcher Affiliation	Collaboration	William Roy Orchard, University of Cambridge, Cambridge, UK; Nastaran Okati, Max Planck Institute for Software Systems, Kaiserslautern, Germany; Sergio Hernan Garrido Mejia, Max Planck Institute for Intelligent Systems and Amazon, Tübingen, Germany; Patrick Blöbaum, Amazon, Tübingen, Germany; Dominik Janzing, Amazon, Tübingen, Germany
Pseudocode	Yes	Algorithm 1 (Smooth Traversal) Returns the variable with the largest positive score difference to its highest scoring parent
Open Source Code	Yes	Code: https://github.com/amazon-science/RCAWithMissingStructuralKnowledgeCode
Open Datasets	Yes	We additionally performed a comprehensive evaluation with two real-world datasets: the Pet Shop [27] dataset (see Appendix F.6) and the Sock-shop 2 dataset [26] (see Appendix F.7), as well as semi-synthetic datasets generated using the ProRCA package [48] (see Appendix F.8).
Dataset Splits	Yes	In all experiments, 1000 observations are drawn according to the randomly assigned model in the normal period. To produce an anomalous sample, a root cause is chosen at random from among the nodes and a target node from among its descendants (including itself). An anomaly is then injected at the root cause by adding x multiples of its standard deviation to its value, and propagated through the causal model.
Hardware Specification	Yes	All the experiments were run on a MacBook Pro with 16 GB of memory and an Apple M1 processor.
Software Dependencies	No	To run Circa and Counterfactual, we used the implementations available in [27] using default parameters. We wrote our own implementation of Traversal... The code for Cholesky is available at [46]
Experiment Setup	Yes	The parametric form of each structural equation is randomly assigned to be either a simple feed-forward neural network, with a probability of 0.8 (to account for non-linear models), or a linear model. The feed-forward neural network has three layers (input layer, hidden layer, and output layer), where the hidden layer has a number of nodes chosen randomly between 2 and 100. All the parameters of the neural network are sampled from a uniform distribution between -5 and 5. For the linear model, we sample the coefficients from a uniform distribution between -1 and 1 and set the intercept to 0. In both cases, we use additive Gaussian noise as the relation between the noise and the variables.
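
The Experiment Setup, Dataset Splits, and Pseudocode rows above describe a synthetic pipeline that can be sketched in a few lines. The block below is a minimal illustration under stated assumptions, not the authors' implementation: the three-node graph, the tanh activation, unit-variance noise, the inclusive 2 to 100 range for hidden units, absolute z-scores as the anomaly score, and comparing parentless nodes against a score of 0 are all hypothetical choices made here. The function largest_jump_over_parents only mirrors the one-sentence description of the return value of Algorithm 1; the full algorithm in the paper may differ.

```python
import numpy as np


def make_mechanism(n_parents, rng):
    """Randomly pick a structural equation: neural net (p=0.8) or linear."""
    if n_parents == 0:
        return lambda X: np.zeros(X.shape[0])  # source node: noise only
    if rng.random() < 0.8:
        hidden = int(rng.integers(2, 101))  # assumed inclusive range 2..100
        W1 = rng.uniform(-5, 5, size=(n_parents, hidden))
        b1 = rng.uniform(-5, 5, size=hidden)
        W2 = rng.uniform(-5, 5, size=hidden)
        b2 = rng.uniform(-5, 5)
        # tanh is an assumption; the source does not name the activation.
        return lambda X: np.tanh(X @ W1 + b1) @ W2 + b2
    coef = rng.uniform(-1, 1, size=n_parents)
    return lambda X: X @ coef  # linear model with intercept 0


def sample(parents, mechanisms, n, rng, shift=None):
    """Draw n observations; `parents` must be listed in topological order."""
    data = {}
    for node, pa in parents.items():
        X = np.column_stack([data[p] for p in pa]) if pa else np.empty((n, 0))
        # Additive Gaussian noise (unit variance is an assumption).
        data[node] = mechanisms[node](X) + rng.normal(size=n)
        if shift and node in shift:
            # Inject the anomaly before children are computed so it propagates.
            data[node] = data[node] + shift[node]
    return data


def largest_jump_over_parents(scores, parents):
    """Node with the largest positive score gap to its highest-scoring parent."""
    best, best_gap = None, 0.0
    for node, pa in parents.items():
        top_parent = max((scores[p] for p in pa), default=0.0)  # assumption for roots
        gap = scores[node] - top_parent
        if gap > best_gap:
            best, best_gap = node, gap
    return best


rng = np.random.default_rng(0)
parents = {"a": [], "b": ["a"], "c": ["a", "b"]}  # hypothetical 3-node DAG
mechanisms = {v: make_mechanism(len(pa), rng) for v, pa in parents.items()}

# Normal period: 1000 observations, as in the Dataset Splits row.
normal = sample(parents, mechanisms, 1000, rng)
mean = {v: normal[v].mean() for v in parents}
std = {v: normal[v].std() for v in parents}

# Anomalous sample: add x standard deviations at the chosen root cause "b".
x = 3.0
anomalous = sample(parents, mechanisms, 1, rng, shift={"b": x * std["b"]})

# Hypothetical anomaly score: absolute z-score against the normal period.
scores = {v: abs(anomalous[v][0] - mean[v]) / std[v] for v in parents}
candidate = largest_jump_over_parents(scores, parents)
```

Here candidate is the node the selection rule would report for this single anomalous sample. It need not equal the injected root cause, since the noise drawn for the anomalous sample also affects every score.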