Counterfactual Predictions under Runtime Confounding

Authors: Amanda Coston, Edward Kennedy, Alexandra Chouldechova

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate our methods against ground truth by performing experiments on simulated data, where we can vary the amount of confounding in order to assess the effect on predictive performance. We also present a validation procedure for evaluating the performance of counterfactual prediction methods.
Researcher Affiliation | Academia | Amanda Coston, Heinz College & Machine Learning Department, Carnegie Mellon University (acoston@cs.cmu.edu); Edward H. Kennedy, Department of Statistics, Carnegie Mellon University (edward@stat.cmu.edu); Alexandra Chouldechova, Heinz College, Carnegie Mellon University (achould@cmu.edu)
Pseudocode | Yes | Algorithm 1: The plug-in (PL) approach; Algorithm 2: The plug-in (PL) approach with cross-fitting; Algorithm 3: The proposed doubly-robust (DR) approach; Algorithm 4: The proposed doubly-robust (DR) approach with cross-fitting; Algorithm 5: Cross-fitting approach to evaluation of counterfactual prediction methods
Open Source Code | No | Source code is in the appendix and will be available at https://github.com/ (The URL provided is incomplete, and the phrasing 'will be available' indicates a future promise rather than concrete access at the time of publication.)
Open Datasets | No | We simulate data as ... (details of simulation); Our dataset consists of over 30,000 calls to the hotline in Allegheny County, PA. We are grateful to Allegheny County Department of Human Services for sharing their data. (The paper describes simulated data generation and uses a real-world private dataset without providing public access links or formal citations for download.)
Dataset Splits | Yes | We report test MSE for 300 simulations, where each simulation generates n = 2000 data points split randomly and evenly into train and test sets. Randomly divide training data into two partitions W1 and W2 (Algorithm 2); randomly divide training data into three partitions W1, W2, W3 (Algorithm 4).
Hardware Specification | No | The paper mentions capabilities of deployment systems, e.g. 'existing case management software cannot run speech/NLP models in realtime', but does not specify any particular hardware (e.g., GPU models, CPU types, memory) used for conducting its experiments.
Software Dependencies | No | The paper refers to using 'LASSO' and 'random forests' as methods for estimation and regression, but it does not specify any software libraries or packages with version numbers (e.g., 'scikit-learn 0.24', 'PyTorch 1.9') that would be necessary for reproduction.
Experiment Setup | Yes | We simulate data as:
V_i ~ N(0, 1) for 1 ≤ i ≤ d_V;
Z_i ~ N(ρV_i, 1 − ρ²) for 1 ≤ i ≤ d_Z;
μ(V, Z) = (k_v / (k_v + ρk_z)) · (Σ_{i=1}^{k_v} V_i + Σ_{i=1}^{k_z} Z_i);
Y^a = μ(V, Z) + ε, with ε ~ N(0, ‖μ(V, Z)‖₂² / (2n));
ν(V) = (k_v / (k_v + ρk_z)) · (Σ_{i=1}^{k_v} V_i + ρ Σ_{i=1}^{k_z} V_i);
π(V, Z) = σ((Σ_{i=1}^{k_v} V_i + Σ_{i=1}^{k_z} Z_i) / (k_v + k_z));
A ~ Bernoulli(π(V, Z)), where σ(x) = 1/(1 + e^{−x}).
We normalize π(v, z) by 1/(k_v + k_z) to satisfy Condition 2.1.4 and use the coefficient k_v/(k_v + ρk_z) to facilitate a fair comparison as we vary ρ. In the first set of experiments, for fixed d = d_V + d_Z = 500, we vary d_V (and correspondingly d_Z). We also vary k_z, which governs the runtime confounding. We use random forests in the first stage for flexibility and LASSO in the second stage for interpretability.
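The simulation setup described in the Experiment Setup row can be sketched in NumPy. This is a hedged reconstruction, not the authors' code: the pairing of each Z_i with V_i, the outcome-noise variance ‖μ‖₂²/(2n), and the sigmoid propensity normalized by 1/(k_v + k_z) are all read off the (extraction-damaged) setup description, and the function name is illustrative.

```python
import numpy as np

def simulate(n, d_v, d_z, k_v, k_z, rho, seed=0):
    """One simulated dataset per the described setup (sketch; assumes d_z <= d_v
    so that each confounder Z_i can be paired with a runtime covariate V_i)."""
    rng = np.random.default_rng(seed)
    V = rng.normal(0.0, 1.0, size=(n, d_v))                # V_i ~ N(0, 1)
    Z = rng.normal(rho * V[:, :d_z], np.sqrt(1 - rho**2))  # Z_i ~ N(rho*V_i, 1-rho^2)

    coef = k_v / (k_v + rho * k_z)  # coefficient keeping comparisons fair as rho varies
    mu = coef * (V[:, :k_v].sum(axis=1) + Z[:, :k_z].sum(axis=1))

    # Outcome noise with variance ||mu||_2^2 / (2n) (reconstructed from the text)
    Y = mu + rng.normal(0.0, np.sqrt((mu**2).sum() / (2 * n)), size=n)

    # Propensity: sigmoid of the normalized sum, so it stays away from 0 and 1
    logits = (V[:, :k_v].sum(axis=1) + Z[:, :k_z].sum(axis=1)) / (k_v + k_z)
    pi = 1.0 / (1.0 + np.exp(-logits))
    A = rng.binomial(1, pi)  # treatment assignment A ~ Bernoulli(pi)
    return V, Z, A, Y
```

In the paper's first set of experiments, d = d_V + d_Z = 500 is held fixed while d_V (and thus d_Z) and k_z are varied; k_z governs the strength of the runtime confounding.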
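The two-stage structure behind the plug-in (PL) approach (Algorithm 1) can be sketched as follows. This is a minimal illustration only: both stages use ordinary least squares so the snippet stays self-contained, whereas the paper uses random forests in the first stage and the LASSO in the second, and the function name is illustrative.

```python
import numpy as np

def plugin_nu_hat(V, Z, A, Y, a=1):
    """Plug-in (PL) sketch: estimate nu(v) = E[Y^a | V = v] in two stages."""
    X = np.hstack([V, Z])
    mask = (A == a)
    # Stage 1: regress Y on (V, Z) among units that actually received treatment a
    beta1, *_ = np.linalg.lstsq(X[mask], Y[mask], rcond=None)
    mu_hat = X @ beta1  # pseudo-outcome mu_hat(V, Z) predicted for every unit
    # Stage 2: regress the pseudo-outcome on the runtime-available V alone
    beta2, *_ = np.linalg.lstsq(V, mu_hat, rcond=None)
    return beta2        # prediction at runtime: nu_hat(v) = v @ beta2
```

The cross-fitted variant (Algorithm 2) would fit stage 1 on one training partition W1 and stage 2 on the other partition W2, rather than reusing the same data for both stages.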