Double Machine Learning Density Estimation for Local Treatment Effects with Instruments
Authors: Yonghan Jung, Jin Tian, Elias Bareinboim
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The use of the proposed methods is illustrated through both synthetic data and a real dataset (401(k)). We illustrate the proposed methods on synthetic and real data. |
| Researcher Affiliation | Academia | Yonghan Jung (Purdue University, jung222@purdue.edu), Jin Tian (Iowa State University, jtian@iastate.edu), Elias Bareinboim (Columbia University, eb@cs.columbia.edu) |
| Pseudocode | No | The paper describes algorithmic steps but does not include a clearly labeled 'Pseudocode' or 'Algorithm' block. |
| Open Source Code | No | The paper does not provide any explicit statements about making its source code available or links to a code repository. |
| Open Datasets | Yes | In our analysis, we used the dataset introduced by [2] containing 9275 individuals, which has been studied in [2, 17, 5, 47, 58, 64], to cite a few. [2] A. Abadie. Semiparametric instrumental variable estimation of treatment response models. Journal of Econometrics, 113(2):231-263, 2003. |
| Dataset Splits | No | The paper mentions 'randomly split halves of the samples' for the DML cross-fitting technique and 'separate validation data or applying cross-validation' for model selection, but it does not report specific training/validation/test splits (e.g., percentages or sample counts) needed for reproducibility (a minimal cross-fitting split sketch appears after the table). |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models used for running the experiments. |
| Software Dependencies | No | The paper mentions using 'XGBoost [11]' for nuisance estimation but does not provide specific version numbers for XGBoost or any other software dependencies. |
| Experiment Setup | Yes | We use the Gaussian kernel. The bandwidth is set to h = 0.5 n^{-1/5}. In estimating the density, we choose 200 equi-spaced points {y^{(i)}}_{i=1}^{200} in Y and evaluate both estimators at K_{h, y^{(i)}} for i = 1, ..., 200. We use KL divergence for D_f and the normal distribution for g(y; β). For both approaches, nuisances are estimated through a gradient boosting model, XGBoost [11], which is known to be flexible. |
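
To make the experiment-setup row concrete, below is a minimal Python sketch of the described kernel evaluation grid: a Gaussian kernel with bandwidth h = 0.5 n^{-1/5}, evaluated at 200 equi-spaced points in the outcome support. The synthetic sample, the grid range, and all variable names are illustrative assumptions, not the authors' code.

```python
# Sketch of the kernel-smoothing grid described in the experiment setup:
# Gaussian kernel, bandwidth h = 0.5 * n**(-1/5), and 200 equi-spaced
# evaluation points in the outcome support. The sample Y and its range
# are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
Y = rng.normal(loc=0.0, scale=1.0, size=9275)  # toy outcome sample (size chosen to match the 401(k) data)

n = Y.shape[0]
h = 0.5 * n ** (-1 / 5)                         # bandwidth h = 0.5 n^{-1/5}
y_grid = np.linspace(Y.min(), Y.max(), 200)     # 200 equi-spaced points y^(1), ..., y^(200)

def gaussian_kernel(u):
    """Standard Gaussian kernel."""
    return np.exp(-0.5 * u ** 2) / np.sqrt(2 * np.pi)

# K_{h, y^(i)}(Y) = (1/h) K((Y - y^(i)) / h), evaluated for every grid point.
K_hy = gaussian_kernel((Y[None, :] - y_grid[:, None]) / h) / h  # shape (200, n)

# A naive (non-DML) density estimate at each grid point is the sample mean of
# the kernel values; the paper's estimators instead plug these kernel
# evaluations into debiased/double-ML moment equations.
density_naive = K_hy.mean(axis=1)
```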
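
The 'randomly split halves of the samples' quote in the Dataset Splits row refers to DML cross-fitting rather than a fixed train/validation/test split. Below is a minimal sketch of that two-fold cross-fitting pattern; the synthetic data, the XGBoost hyperparameters, and the choice of a conditional-mean nuisance are assumptions for illustration only.

```python
# Minimal sketch of two-fold sample splitting (cross-fitting) for DML-style
# estimators, as hinted at by the "randomly split halves of the samples" quote.
# All names and the synthetic data below are illustrative assumptions.
import numpy as np
from xgboost import XGBRegressor

rng = np.random.default_rng(0)
n = 1000
X = rng.normal(size=(n, 5))                       # covariates
Y = X @ rng.normal(size=5) + rng.normal(size=n)   # outcome

# Randomly split the sample into two halves.
idx = rng.permutation(n)
fold1, fold2 = idx[: n // 2], idx[n // 2 :]

nuisance_hat = np.empty(n)
for train_idx, eval_idx in [(fold1, fold2), (fold2, fold1)]:
    # Fit the nuisance (here, a conditional mean) on one half ...
    model = XGBRegressor(n_estimators=200, max_depth=3)
    model.fit(X[train_idx], Y[train_idx])
    # ... and evaluate it on the held-out half (cross-fitting).
    nuisance_hat[eval_idx] = model.predict(X[eval_idx])
```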