Valid Causal Inference with (Some) Invalid Instruments

Authors: Jason S Hartford, Victor Veitch, Dhanya Sridhar, Kevin Leyton-Brown

ICML 2021

Reproducibility

Variable | Result | LLM Response
Research Type | Experimental | Experimentally, we achieve accurate estimates of conditional average treatment effects using an ensemble of deep network-based estimators, including on a challenging simulated Mendelian randomization problem. We studied ModeIV empirically in two simulation settings.
Researcher Affiliation | Academia | University of British Columbia, Vancouver, Canada; University of Chicago, Illinois, USA; Columbia University, New York, USA.
Pseudocode | Yes | The ModeIV procedure requires the analyst to specify a lower bound V ≥ 2 on the number of valid instruments and then proceeds in three steps. 1. Fit an ensemble of k estimates of the conditional outcome {f̂_1, …, f̂_k} using a non-linear IV procedure applied to each of the k instruments. 2. For a given test point (t, x), select [l̂, û] as the smallest interval containing V of the estimates {f̂_1(t, x), …, f̂_k(t, x)}. Define Î_mode = {i : l̂ ≤ f̂_i(t, x) ≤ û} to be the indices of the instruments corresponding to estimates falling in the interval. 3. Return f̂_mode(t, x) = (1/|Î_mode|) Σ_{i ∈ Î_mode} f̂_i(t, x). Figure 1 shows this procedure graphically.
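The three-step procedure quoted above can be sketched in a few lines. The snippet below is a minimal NumPy illustration of steps 2 and 3 at a single test point; the paper's efficient implementation is in PyTorch, and the function name and interface here are assumptions made for illustration only.

```python
import numpy as np

def mode_iv_estimate(estimates, V):
    """Illustrative sketch of ModeIV steps 2-3 at one test point (t, x).

    `estimates` holds the k single-instrument estimates f_i(t, x);
    `V` is the analyst's lower bound on the number of valid instruments.
    Returns the modal-interval average and the selected instrument indices.
    """
    est = np.asarray(estimates, dtype=float)
    k = len(est)
    s = np.sort(est)
    # Step 2: the smallest interval [l, u] containing V of the estimates
    # always spans V consecutive values in sorted order, so scan windows.
    widths = s[V - 1:] - s[: k - V + 1]
    start = int(np.argmin(widths))
    l, u = s[start], s[start + V - 1]
    mode_set = [i for i in range(k) if l <= est[i] <= u]
    # Step 3: average the estimates falling inside the modal interval.
    return est[mode_set].mean(), mode_set
```

For example, with five single-instrument estimates [0.0, 1.0, 1.1, 1.2, 5.0] and V = 3, the tightest window is [1.0, 1.2], so the two outlying (presumably invalid) instruments are excluded and the remaining three are averaged.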
Open Source Code | Yes | See the appendix for an efficient PyTorch (Paszke et al., 2019) implementation.

Open Datasets | No | The paper describes how it *simulates* the data for its experiments by modifying existing simulations from Hartford et al. (2017) and Hartwig et al. (2017), but it does not provide access information (link, DOI, or specific repository) for the *specific simulated datasets* used.

Dataset Splits | No | The paper refers to a 'test point' and to 'training', but it does not explicitly provide dataset split information (percentages, counts, or detailed methodology) for training, validation, or testing.

Hardware Specification | No | The paper acknowledges support from Compute Canada and a GPU grant from NVIDIA, but it does not specify hardware details such as exact GPU models, CPU models, or memory amounts used for the experiments.

Software Dependencies | Yes | See the appendix for an efficient PyTorch (Paszke et al., 2019) implementation.

Experiment Setup | No | The paper describes the general model used (Deep IV) and some architectural choices, such as using a neural network to parameterize the slope, but the main text does not provide specific hyperparameter values (learning rate, batch size, optimizer settings) or other system-level training configurations.