Counterfactual Predictions under Runtime Confounding

Authors: Amanda Coston, Edward Kennedy, Alexandra Chouldechova

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate our methods against ground truth by performing experiments on simulated data, where we can vary the amount of confounding in order to assess the effect on predictive performance. We also present a validation procedure for evaluating the performance of counterfactual prediction methods.
Researcher Affiliation | Academia | Amanda Coston, Heinz College & Machine Learning Department, Carnegie Mellon University (acoston@cs.cmu.edu); Edward H. Kennedy, Department of Statistics, Carnegie Mellon University (edward@stat.cmu.edu); Alexandra Chouldechova, Heinz College, Carnegie Mellon University (achould@cmu.edu)
Pseudocode | Yes | Algorithm 1: The plug-in (PL) approach; Algorithm 2: The plug-in (PL) approach with cross-fitting; Algorithm 3: The proposed doubly-robust (DR) approach; Algorithm 4: The proposed doubly-robust (DR) approach with cross-fitting; Algorithm 5: Cross-fitting approach to evaluation of counterfactual prediction methods
Open Source Code | No | Source code is in the appendix and will be available at https://github.com/ (The URL provided is incomplete, and the phrasing 'will be available' indicates a future promise rather than concrete access at the time of publication.)
Open Datasets | No | We simulate data as ... (details of simulation); Our dataset consists of over 30,000 calls to the hotline in Allegheny County, PA. We are grateful to Allegheny County Department of Human Services for sharing their data. (The paper describes simulated data generation and uses a real-world private dataset without providing public access links or formal citations for download.)
Dataset Splits | Yes | We report test MSE for 300 simulations, where each simulation generates n = 2000 data points split randomly and evenly into train and test sets. Randomly divide training data into two partitions W1 and W2 (Algorithm 2); randomly divide training data into three partitions W1, W2, W3 (Algorithm 4).
Hardware Specification | No | The paper mentions capabilities of deployment systems, e.g. 'existing case management software cannot run speech/NLP models in realtime', but does not specify any particular hardware (e.g., GPU models, CPU types, memory) used for conducting its experiments.
Software Dependencies | No | The paper refers to using 'LASSO' and 'random forests' as methods for estimation and regression, but it does not specify any software libraries or packages with version numbers (e.g., 'scikit-learn 0.24', 'PyTorch 1.9') that would be necessary for reproduction.
Experiment Setup | Yes | We simulate data as:
V_i ~ N(0, 1) for 1 ≤ i ≤ d_V;
Z_i ~ N(ρV_i, 1 − ρ²) for 1 ≤ i ≤ d_Z;
μ(V, Z) = (k_v / (k_v + ρk_z)) · (Σ_{i=1}^{k_v} V_i + Σ_{i=1}^{k_z} Z_i);
Y^a = μ(V, Z) + ε, with ε ~ N(0, ‖μ(V, Z)‖₂² / (2n));
ν(V) = (k_v / (k_v + ρk_z)) · (Σ_{i=1}^{k_v} V_i + ρ Σ_{i=1}^{k_z} V_i);
π(V, Z) = σ((Σ_{i=1}^{k_v} V_i + Σ_{i=1}^{k_z} Z_i) / (k_v + k_z));
A ~ Bernoulli(π(V, Z)), where σ(x) = 1/(1 + e^{−x}).
We normalize π(v, z) by 1/(k_v + k_z) to satisfy Condition 2.1.4 and use the coefficient k_v/(k_v + ρk_z) to facilitate a fair comparison as we vary ρ. In the first set of experiments, for fixed d = d_V + d_Z = 500, we vary d_V (and correspondingly d_Z). We also vary k_z, which governs the runtime confounding. We use random forests in the first stage for flexibility and LASSO in the second stage for interpretability.
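The simulation setup described in the Experiment Setup row can be sketched in NumPy. This is a hedged reconstruction, not the authors' code: the pairing of each Z_i with V_i, the outcome-noise variance ‖μ‖₂²/(2n), and the sigmoid propensity normalized by 1/(k_v + k_z) are all read off the (extraction-damaged) setup description, and the function name is illustrative.

```python
import numpy as np

def simulate(n, d_v, d_z, k_v, k_z, rho, seed=0):
    """One simulated dataset per the described setup (sketch; assumes d_z <= d_v
    so that each confounder Z_i can be paired with a runtime covariate V_i)."""
    rng = np.random.default_rng(seed)
    V = rng.normal(0.0, 1.0, size=(n, d_v))                # V_i ~ N(0, 1)
    Z = rng.normal(rho * V[:, :d_z], np.sqrt(1 - rho**2))  # Z_i ~ N(rho*V_i, 1-rho^2)

    coef = k_v / (k_v + rho * k_z)  # coefficient keeping comparisons fair as rho varies
    mu = coef * (V[:, :k_v].sum(axis=1) + Z[:, :k_z].sum(axis=1))

    # Outcome noise with variance ||mu||_2^2 / (2n) (reconstructed from the text)
    Y = mu + rng.normal(0.0, np.sqrt((mu**2).sum() / (2 * n)), size=n)

    # Propensity: sigmoid of the normalized sum, so it stays away from 0 and 1
    logits = (V[:, :k_v].sum(axis=1) + Z[:, :k_z].sum(axis=1)) / (k_v + k_z)
    pi = 1.0 / (1.0 + np.exp(-logits))
    A = rng.binomial(1, pi)  # treatment assignment A ~ Bernoulli(pi)
    return V, Z, A, Y
```

In the paper's first set of experiments, d = d_V + d_Z = 500 is held fixed while d_V (and thus d_Z) and k_z are varied; k_z governs the strength of the runtime confounding.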
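The two-stage structure behind the plug-in (PL) approach (Algorithm 1) can be sketched as follows. This is a minimal illustration only: both stages use ordinary least squares so the snippet stays self-contained, whereas the paper uses random forests in the first stage and the LASSO in the second, and the function name is illustrative.

```python
import numpy as np

def plugin_nu_hat(V, Z, A, Y, a=1):
    """Plug-in (PL) sketch: estimate nu(v) = E[Y^a | V = v] in two stages."""
    X = np.hstack([V, Z])
    mask = (A == a)
    # Stage 1: regress Y on (V, Z) among units that actually received treatment a
    beta1, *_ = np.linalg.lstsq(X[mask], Y[mask], rcond=None)
    mu_hat = X @ beta1  # pseudo-outcome mu_hat(V, Z) predicted for every unit
    # Stage 2: regress the pseudo-outcome on the runtime-available V alone
    beta2, *_ = np.linalg.lstsq(V, mu_hat, rcond=None)
    return beta2        # prediction at runtime: nu_hat(v) = v @ beta2
```

The cross-fitted variant (Algorithm 2) would fit stage 1 on one training partition W1 and stage 2 on the other partition W2, rather than reusing the same data for both stages.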