Scalable Sensitivity and Uncertainty Analyses for Causal-Effect Estimates of Continuous-Valued Interventions
Authors: Andrew Jesson, Alyson Douglas, Peter Manshausen, Maëlys Solal, Nicolai Meinshausen, Philip Stier, Yarin Gal, Uri Shalit
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Here we empirically validate our method. First, we consider a synthetic structural causal model (SCM) to demonstrate the validity of our method. Next, we show the scalability of our methods by applying them to a real-world climate-science-inspired problem. |
| Researcher Affiliation | Academia | Andrew Jesson, OATML, Department of Computer Science, University of Oxford; Alyson Douglas, AOPP, Department of Physics, University of Oxford; Peter Manshausen, AOPP, Department of Physics, University of Oxford; Maëlys Solal, Department of Computer Science, University of Oxford; Nicolai Meinshausen, Seminar for Statistics, Department of Mathematics, ETH Zurich; Philip Stier, AOPP, Department of Physics, University of Oxford; Yarin Gal, OATML, Department of Computer Science, University of Oxford; Uri Shalit, Machine Learning and Causal Inference in Healthcare Lab, Technion Israel Institute of Technology |
| Pseudocode | Yes | Algorithm 1 Grid Search Interval Optimizer |
| Open Source Code | Yes | Implementation details (appendix H), datasets (appendix G), and code are provided at https://github.com/oatml/overcast. |
| Open Datasets | Yes | The datasets used in this work are publicly available. We use daily observed 1° × 1° means of clouds, aerosol, and the environment from sources shown in Table 1 of Appendix G. MODIS data from the NASA Earthdata system [BP06]... MERRA-2 reanalysis data [BAC+15, GMS+17]... |
| Dataset Splits | Yes | For the climate dataset, we use an 80/20 train/test split. The dataset contains 3 million daily observations for 15 years, so this gives us a test set of 600,000 observations. We use a 10% validation set from the training data for early stopping and hyperparameter optimization. |
| Hardware Specification | Yes | All of our models are trained on NVIDIA V100 GPUs using the PyTorch [PGM+19] framework. On average, each model took 12 hours to train. |
| Software Dependencies | Yes | Our models are written in PyTorch [PGM+19] (version 1.10.0), Ray [MNW+18] (version 1.12.0), Tune [LLN+18] (version 1.12.0), and SciKit Learn [PVG+11] (version 1.0.2). Python version 3.8.10. |
| Experiment Setup | Yes | The batch size is 256. We use the Adam optimizer [KB14] with a learning rate of 0.001 and weight decay of 0.001. We use a 10% validation set from the training data for early stopping and hyperparameter optimization. |
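The reported dataset splits (80/20 train/test, with 10% of training data held out for validation) and optimizer settings (Adam, learning rate 0.001, weight decay 0.001, batch size 256) can be sketched as follows. This is a minimal PyTorch sketch with placeholder data and a stand-in model; the authors' actual implementation lives in the linked repository (https://github.com/oatml/overcast).

```python
import torch

torch.manual_seed(0)
X = torch.randn(1_000, 8)      # hypothetical covariates (placeholder, not the climate data)
y = torch.randn(1_000, 1)      # hypothetical targets
model = torch.nn.Linear(8, 1)  # stand-in for the paper's network

# 80/20 train/test split, then 10% of the training data held out for
# early stopping and hyperparameter optimization (as reported).
perm = torch.randperm(len(X))
n_test = len(X) // 5
test_idx, train_idx = perm[:n_test], perm[n_test:]
n_val = len(train_idx) // 10
val_idx, fit_idx = train_idx[:n_val], train_idx[n_val:]

# Reported optimizer settings: Adam with lr=0.001 and weight decay 0.001;
# batches of 256 examples.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-3)
loader = torch.utils.data.DataLoader(
    torch.utils.data.TensorDataset(X[fit_idx], y[fit_idx]),
    batch_size=256,
    shuffle=True,
)

# One illustrative training step.
xb, yb = next(iter(loader))
loss = torch.nn.functional.mse_loss(model(xb), yb)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

On the paper's full climate dataset (~3 million observations), the same 80/20 split yields the test set of 600,000 observations quoted above.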