Minimax Optimal Nonparametric Estimation of Heterogeneous Treatment Effects
Authors: Zijun Gao, Yanjun Han
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We illustrate the efficacy of the proposed estimator in Algorithm 2 via some numerical experiments. Specifically, we aim to show that the two main ingredients of Algorithm 2, i.e. constructing pseudo-observations based on covariate matching and discarding observations with poor matching quality, are key to improved HTE estimation. We compare our estimator (which we call selected matching) with the following three estimators: the full matching estimator, which never discards samples (i.e. m2 = m1 always holds in Algorithm 2), and the k-NN differencing and kernel differencing estimators, which apply separate k-NN or kernel estimates to both baselines and then take the difference. The performance of HTE estimation is measured via the root mean squared error (RMSE) averaged over 100 simulations. The experimental results are displayed in Figures 1 and 2. |
| Researcher Affiliation | Academia | Zijun Gao, Department of Statistics, Stanford University, Email: zijungao@stanford.edu; Yanjun Han, Department of Electrical Engineering, Stanford University, Email: yjhan@stanford.edu |
| Pseudocode | Yes | Algorithm 1 Estimator Construction under Fixed Design Algorithm 2 Estimator Construction under Random Design |
| Open Source Code | Yes | The source codes are available at https://github.com/Mathegineer/Nonparametric_HTE. |
| Open Datasets | No | The paper uses synthetically generated data for its experiments, as described by: 'For each given (n, d, κ, σ), we generate n control covariates X^0_1, …, X^0_n following the i.i.d. density g0(x)... Similarly, the treatment covariates X^1_1, …, X^1_n are i.i.d. generated following the density g1(x) = 2 − g0(x), and the responses Y^0_i, Y^1_i are defined in (1) with i.i.d. N(0, σ²) noises.' It does not refer to or provide access information for a pre-existing public dataset. |
| Dataset Splits | No | The paper uses synthetically generated data for each simulation and does not describe explicit train/validation/test splits of a dataset. |
| Hardware Specification | No | The paper does not provide any specific hardware details (e.g., CPU, GPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper provides a link to source code but does not list specific software dependencies with version numbers within its text. |
| Experiment Setup | No | The paper describes the input parameter settings for the simulations (n, d, κ, σ) and states that 'The algorithm parameters are determined by the optimal bias-variance tradeoffs in theory.' However, it does not provide concrete hyperparameter values or detailed training configurations (e.g., specific m1, m2 values chosen for the experiments, or other optimization settings) for their implemented algorithms. |
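To make the comparison protocol quoted above concrete, the following is a minimal sketch of one of the paper's baseline competitors, the k-NN differencing estimator: fit a separate k-NN regression to the control and treatment samples and take the difference, then average RMSE over repeated simulations. The data-generating functions (`mu0`, `tau`), the uniform covariate design, and all parameter values here are illustrative assumptions, not the paper's actual simulation configuration (which the paper specifies via (n, d, κ, σ) and the densities g0, g1).

```python
import numpy as np

def knn_predict(X_train, y_train, X_query, k):
    """Plain k-NN regression: average the responses of the k nearest neighbours."""
    preds = np.empty(len(X_query))
    for i, x in enumerate(X_query):
        dist = np.linalg.norm(X_train - x, axis=1)
        nearest = np.argsort(dist)[:k]
        preds[i] = y_train[nearest].mean()
    return preds

def knn_differencing_hte(X0, y0, X1, y1, X_eval, k=5):
    """Estimate tau(x) = mu1(x) - mu0(x) by differencing two separate k-NN fits."""
    return knn_predict(X1, y1, X_eval, k) - knn_predict(X0, y0, X_eval, k)

def simulate_rmse(n=200, d=1, sigma=0.5, n_sims=20, k=5, seed=0):
    """Average RMSE of the differencing estimator over repeated simulations.

    mu0 and tau below are illustrative placeholders, not the paper's choices.
    """
    rng = np.random.default_rng(seed)
    mu0 = lambda x: np.sin(2 * np.pi * x[:, 0])   # control baseline (assumed)
    tau = lambda x: x[:, 0] ** 2                  # treatment effect (assumed)
    errs = []
    for _ in range(n_sims):
        X0 = rng.uniform(size=(n, d))
        X1 = rng.uniform(size=(n, d))
        y0 = mu0(X0) + sigma * rng.standard_normal(n)
        y1 = mu0(X1) + tau(X1) + sigma * rng.standard_normal(n)
        X_eval = rng.uniform(size=(n, d))
        tau_hat = knn_differencing_hte(X0, y0, X1, y1, X_eval, k)
        errs.append(np.sqrt(np.mean((tau_hat - tau(X_eval)) ** 2)))
    return float(np.mean(errs))

rmse = simulate_rmse()
print(f"k-NN differencing RMSE: {rmse:.3f}")
```

The paper's selected-matching estimator differs precisely in the two ingredients the quote highlights: it constructs pseudo-observations by matching treatment covariates to nearby control covariates, and it discards matches of poor quality, rather than fitting the two response surfaces independently as above.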