Scalable Sensitivity and Uncertainty Analyses for Causal-Effect Estimates of Continuous-Valued Interventions
Authors: Andrew Jesson, Alyson Douglas, Peter Manshausen, Maëlys Solal, Nicolai Meinshausen, Philip Stier, Yarin Gal, Uri Shalit
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Here we empirically validate our method. First, we consider a synthetic structural causal model (SCM) to demonstrate the validity of our method. Next, we show the scalability of our methods by applying them to a real-world climate-science-inspired problem. |
| Researcher Affiliation | Academia | Andrew Jesson, OATML, Department of Computer Science, University of Oxford; Alyson Douglas, AOPP, Department of Physics, University of Oxford; Peter Manshausen, AOPP, Department of Physics, University of Oxford; Maëlys Solal, Department of Computer Science, University of Oxford; Nicolai Meinshausen, Seminar for Statistics, Department of Mathematics, ETH Zurich; Philip Stier, AOPP, Department of Physics, University of Oxford; Yarin Gal, OATML, Department of Computer Science, University of Oxford; Uri Shalit, Machine Learning and Causal Inference in Healthcare Lab, Technion Israel Institute of Technology |
| Pseudocode | Yes | Algorithm 1 Grid Search Interval Optimizer |
| Open Source Code | Yes | Implementation details (appendix H), datasets (appendix G), and code are provided at https://github.com/oatml/overcast. |
| Open Datasets | Yes | The datasets used in this work are publicly available. We use daily observed 1° × 1° means of clouds, aerosol, and the environment from sources shown in Table 1 of Appendix G. MODIS data from the NASA Earthdata system [BP06]... MERRA-2 reanalysis data [BAC+15, GMS+17]... |
| Dataset Splits | Yes | For the climate dataset, we use an 80/20 train/test split. The dataset contains 3 million daily observations for 15 years, so this gives us a test set of 600,000 observations. We use a 10% validation set from the training data for early stopping and hyperparameter optimization. |
| Hardware Specification | Yes | All of our models are trained on NVIDIA V100 GPUs using the PyTorch [PGM+19] framework. On average, each model took 12 hours to train. |
| Software Dependencies | Yes | Our models are written in PyTorch [PGM+19] (version 1.10.0), Ray [MNW+18] (version 1.12.0), Tune [LLN+18] (version 1.12.0), and SciKit Learn [PVG+11] (version 1.0.2). Python version 3.8.10. |
| Experiment Setup | Yes | The batch size is 256. We use the Adam optimizer [KB14] with a learning rate of 0.001 and weight decay of 0.001. We use a 10% validation set from the training data for early stopping and hyperparameter optimization. |
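The reported dataset splits (80/20 train/test, with 10% of training data held out for validation) and optimizer settings (Adam, learning rate 0.001, weight decay 0.001, batch size 256) can be sketched as follows. This is a minimal PyTorch sketch with placeholder data and a stand-in model; the authors' actual implementation lives in the linked repository (https://github.com/oatml/overcast).

```python
import torch

torch.manual_seed(0)
X = torch.randn(1_000, 8)      # hypothetical covariates (placeholder, not the climate data)
y = torch.randn(1_000, 1)      # hypothetical targets
model = torch.nn.Linear(8, 1)  # stand-in for the paper's network

# 80/20 train/test split, then 10% of the training data held out for
# early stopping and hyperparameter optimization (as reported).
perm = torch.randperm(len(X))
n_test = len(X) // 5
test_idx, train_idx = perm[:n_test], perm[n_test:]
n_val = len(train_idx) // 10
val_idx, fit_idx = train_idx[:n_val], train_idx[n_val:]

# Reported optimizer settings: Adam with lr=0.001 and weight decay 0.001;
# batches of 256 examples.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-3)
loader = torch.utils.data.DataLoader(
    torch.utils.data.TensorDataset(X[fit_idx], y[fit_idx]),
    batch_size=256,
    shuffle=True,
)

# One illustrative training step.
xb, yb = next(iter(loader))
loss = torch.nn.functional.mse_loss(model(xb), yb)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

On the paper's full climate dataset (~3 million observations), the same 80/20 split yields the test set of 600,000 observations quoted above.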