Matching on Balanced Nonlinear Representations for Treatment Effects Estimation

Authors: Sheng Li, Yun Fu

NeurIPS 2017

Reproducibility assessment (variable, result, and supporting LLM response):
Research Type: Experimental
  "Experiments on several synthetic and real-world datasets demonstrate the effectiveness of our approach."
Researcher Affiliation: Collaboration
  Sheng Li, Adobe Research, San Jose, CA (sheli@adobe.com); Yun Fu, Northeastern University, Boston, MA (yunfu@ece.nu.edu)
Pseudocode: Yes
  Algorithm 1 (BNR-NNM).
Open Source Code: No
  The paper neither states that source code for the described method is available nor links to a code repository.
Open Datasets: Yes
  Synthetic dataset: generated following the protocols described in [41, 25]. IHDP dataset with simulated outcomes: IHDP [16] is an experimental dataset collected by the Infant Health and Development Program. LaLonde dataset with real outcomes: a widely used benchmark for observational studies [23].
Dataset Splits: No
  The paper mentions training but gives no specifics on validation splits (percentages, absolute counts, or k-fold procedures) for hyperparameter tuning. Cross-validation is mentioned as a parameter-selection strategy, but not in the context of a fixed validation split.
Hardware Specification: No
  The paper provides no details about the hardware used for the experiments (e.g., GPU/CPU models, memory), nor even a vague description such as "on a server."
Software Dependencies: No
  The paper lists no software dependencies with version numbers (e.g., Python 3.8, PyTorch 1.9). It mentions logistic regression and kernel functions, but not specific software implementations or versions.
Experiment Setup: Yes
  The major parameters in BNR-NNM are α, β, and c. In the experiments, α is empirically set to 1; β is chosen from {10⁻³, 10⁻¹, 1, 10, 10³}; and the number of categories c is chosen from {2, 4, 6, 8}. A Gaussian kernel k(x_i, x_j) = exp(−‖x_i − x_j‖² / (2σ²)) is used, with the bandwidth parameter σ empirically set to 5.
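The kernel and hyperparameter grids in the setup above are concrete enough to sketch in code. The following is a minimal NumPy sketch of the reported Gaussian kernel and search grids only; it is not a reproduction of the BNR-NNM algorithm itself, and the function name is our own.

```python
import numpy as np

def gaussian_kernel(X, Y, sigma=5.0):
    """Gaussian kernel k(x_i, y_j) = exp(-||x_i - y_j||^2 / (2 sigma^2)).

    X: (n, d) array, Y: (m, d) array; returns the (n, m) kernel matrix.
    sigma defaults to 5, the bandwidth reported in the paper.
    """
    # Pairwise squared distances via the expansion ||x||^2 + ||y||^2 - 2 x.y
    sq_dist = (
        np.sum(X**2, axis=1)[:, None]
        + np.sum(Y**2, axis=1)[None, :]
        - 2.0 * X @ Y.T
    )
    # Clamp tiny negative values caused by floating-point round-off
    sq_dist = np.maximum(sq_dist, 0.0)
    return np.exp(-sq_dist / (2.0 * sigma**2))

# Hyperparameter grids as reported in the experiment setup
alpha = 1.0                            # fixed empirically
beta_grid = [1e-3, 1e-1, 1.0, 10.0, 1e3]
c_grid = [2, 4, 6, 8]                  # number of categories
```

The squared-distance expansion avoids an explicit double loop; for a candidate search, one would evaluate the method at each (β, c) pair from the grids above, with the selection criterion left unspecified by the paper beyond a mention of cross-validation.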