Kernel Instrumental Variable Regression

Authors: Rahul Singh, Maneesh Sahani, Arthur Gretton

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In experiments, KIV outperforms state of the art alternatives for nonparametric IV regression. We compare the empirical performance of KIV (Kernel IV) to four leading competitors: standard kernel ridge regression (Kernel Reg) [50], Nadaraya-Watson IV (Smooth IV) [16, 23], sieve IV (Sieve IV) [48, 17], and deep IV (Deep IV) [36].
Researcher Affiliation | Academia | Rahul Singh (MIT Economics, rahul.singh@mit.edu); Maneesh Sahani (Gatsby Unit, UCL, maneesh@gatsby.ucl.ac.uk); Arthur Gretton (Gatsby Unit, UCL, arthur.gretton@gmail.com)
Pseudocode | Yes | Algorithm 1 (KIV). Let X and Z be matrices of n stage-1 observations, and let ỹ and Z̃ be a vector and matrix of m stage-2 observations. Then W = K_XX (K_ZZ + nλI)⁻¹ K_ZZ̃, α̂ = (WW′ + mξ K_XX)⁻¹ W ỹ, and ĥ_m(x) = (α̂)′ K_Xx, where K_XX and K_ZZ are the empirical kernel matrices.
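To make the closed-form estimator concrete, here is a minimal numpy sketch of Algorithm 1. It assumes Gaussian kernels; the function name kiv_fit_predict, the bandwidth sigma, and any hyperparameter values are illustrative choices, not from the paper.

    import numpy as np

    def rbf_kernel(A, B, sigma=1.0):
        # Gaussian kernel matrix between the rows of A and the rows of B
        sq = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
        return np.exp(-sq / (2 * sigma**2))

    def kiv_fit_predict(X, Z, Z_tilde, y_tilde, x_test, lam, xi, sigma=1.0):
        # Kernel IV, Algorithm 1: X, Z hold n stage-1 samples; Z_tilde, y_tilde
        # hold m stage-2 samples; lam and xi are the stage-1/stage-2 regularizers.
        n, m = X.shape[0], Z_tilde.shape[0]
        K_XX = rbf_kernel(X, X, sigma)
        K_ZZ = rbf_kernel(Z, Z, sigma)
        K_ZZt = rbf_kernel(Z, Z_tilde, sigma)
        # Stage 1: W = K_XX (K_ZZ + n*lam*I)^{-1} K_ZZ~
        W = K_XX @ np.linalg.solve(K_ZZ + n * lam * np.eye(n), K_ZZt)
        # Stage 2: alpha = (W W' + m*xi*K_XX)^{-1} W y~
        alpha = np.linalg.solve(W @ W.T + m * xi * K_XX, W @ y_tilde)
        # Prediction: h_m(x) = alpha' K_Xx at each test point
        return rbf_kernel(x_test, X, sigma) @ alpha

Both stages are ridge regressions with closed-form solutions, so no iterative optimization is needed; the cost is dominated by the two n-by-n linear solves.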
Open Source Code | Yes | Code: https://github.com/r4hu1-5in9h/KIV
Open Datasets | No | We implement each estimator on three designs. The linear design [17] involves learning the counterfactual function h(x) = 4x − 2, given confounded observations of continuous variables (X, Y) as well as a continuous instrument Z. The sigmoid design [17] involves learning the counterfactual function h(x) = ln(|16x − 8| + 1)·sgn(x − 0.5) under the same regime. The demand design [36] involves learning the demand function h(p, t, s) = 100 + (10 + p)·s·ψ(t) − 2p... The paper references designs from other papers but does not provide direct links or access information for these datasets.
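The three structural functions can be written directly in code. The linear and sigmoid forms below follow the expressions quoted above; the exact form of the nonlinear time effect ψ(t) is recalled from the Deep IV simulation in [36] and should be treated as an assumption rather than something stated in this paper.

    import numpy as np

    def h_linear(x):
        # Linear design [17]
        return 4.0 * x - 2.0

    def h_sigmoid(x):
        # Sigmoid design [17]
        return np.log(np.abs(16.0 * x - 8.0) + 1.0) * np.sign(x - 0.5)

    def psi(t):
        # Nonlinear time effect used in the Deep IV demand simulation [36];
        # exact form recalled from that paper, included here as an assumption.
        return 2.0 * ((t - 5.0)**4 / 600.0
                      + np.exp(-4.0 * (t - 5.0)**2) + t / 10.0 - 2.0)

    def h_demand(p, t, s):
        # Demand design [36]: price p, time of year t, customer type s
        return 100.0 + (10.0 + p) * s * psi(t) - 2.0 * p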
Dataset Splits | No | Sample splitting in this context means estimating stage 1 with n randomly chosen observations and estimating stage 2 with the remaining m observations. In Appendix A.5.2, we provide a validation procedure to empirically determine values for (λ, ξ). While sample splitting is mentioned and a validation procedure is noted, the paper does not specify explicit training/validation/test percentages, absolute sample counts, or predefined splits for the empirical experiments.
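For reference, the split itself is a one-line random partition. The sketch below assumes a 50/50 division and placeholder arrays X_all, Z_all, y_all; the paper does not pin down the proportion, so n and m here are purely illustrative.

    import numpy as np

    rng = np.random.default_rng(0)
    N = X_all.shape[0]                    # X_all, Z_all, y_all: full sample (placeholders)
    perm = rng.permutation(N)
    stage1, stage2 = perm[: N // 2], perm[N // 2:]   # illustrative 50/50 split

    X, Z = X_all[stage1], Z_all[stage1]              # n stage-1 observations
    Z_tilde, y_tilde = Z_all[stage2], y_all[stage2]  # m stage-2 observations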
Hardware Specification | No | The paper does not provide any specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments.
Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers, such as programming language versions or library versions.
Experiment Setup | No | For each algorithm, design, and sample size, we implement 40 simulations and calculate MSE with respect to the true structural function h. See Appendix A.11 for representative plots, implementation details, and a robustness study. The main text defers implementation details to the appendix and does not provide specific hyperparameter values, training configurations, or system-level settings in the main body.
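Tying the sketches above together, the evaluation protocol reduces to a short loop. Everything here is illustrative: simulate_design is a hypothetical data generator, the (lam, xi) values are placeholders, and the test grid is an assumption, since the paper leaves these details to its appendix.

    import numpy as np

    n_sims = 40
    x_grid = np.linspace(0.0, 1.0, 1000).reshape(-1, 1)  # illustrative test grid
    h_true = h_sigmoid(x_grid).ravel()                   # true structural function

    mses = []
    for seed in range(n_sims):
        # simulate_design is a hypothetical helper returning one confounded draw
        X_all, Z_all, y_all = simulate_design(seed=seed)
        rng = np.random.default_rng(seed)
        perm = rng.permutation(X_all.shape[0])
        s1, s2 = perm[: len(perm) // 2], perm[len(perm) // 2:]
        h_hat = kiv_fit_predict(X_all[s1], Z_all[s1], Z_all[s2], y_all[s2],
                                x_grid, lam=0.1, xi=0.1)  # placeholder (lambda, xi)
        mses.append(np.mean((h_hat - h_true) ** 2))

    print(f"MSE over {n_sims} simulations: {np.mean(mses):.4f} +/- {np.std(mses):.4f}")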