Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Sample Constrained Treatment Effect Estimation
Authors: Raghavendra Addanki, David Arbour, Tung Mai, Cameron Musco, Anup Rao
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We also empirically demonstrate the performance of our algorithms. Finally, in Section 5, we provide an empirical evaluation of the performance of our ITE and ATE estimation methods, comparing against uniform sampling and other baselines on several datasets. |
| Researcher Affiliation | Collaboration | Raghavendra Addanki Adobe Research EMAIL David Arbour Adobe Research EMAIL Tung Mai Adobe Research EMAIL Cameron Musco University of Massachusetts Amherst EMAIL Anup Rao Adobe Research EMAIL |
| Pseudocode | Yes | Algorithm 1 SAMPLING-ITE |
| Open Source Code | Yes | Our code is publicly accessible using the following github repository. |
| Open Datasets | Yes | We evaluate our approaches on five datasets: (i) IHDP. ... (ii) Twins. ... (iii) LaLonde. ... (iv) Boston. ... (v) Synthetic. ... For IHDP and Twins datasets, we use the simulated values for potential outcomes, similar to Shalit et al. [43] and Louizos et al. [32]. For the Synthetic dataset, we simulate values for the outcomes using linear functions of the covariate matrix. For Boston, LaLonde datasets, as we have access to only one of the outcome values, we chose to compare our algorithms for a fixed shift in treatment effect (i.e., the true treatment effect is equal to a constant), similar to Arbour et al. [6]. |
| Dataset Splits | No | The paper describes using various datasets and experimenting with different 'sample sizes (as proportion of dataset size)', but it does not specify explicit train/validation/test dataset splits with percentages or sample counts for reproducing experiments on the full datasets. |
| Hardware Specification | No | The paper mentions running experiments but does not provide specific hardware details such as GPU/CPU models, memory, or specific cloud/cluster configurations used for computation. |
| Software Dependencies | No | The paper does not specify any software dependencies with version numbers (e.g., Python, PyTorch, or specific libraries) used in the experiments. |
| Experiment Setup | No | The paper describes the datasets used and the baselines for comparison, along with the evaluation metrics, but it does not provide specific experimental setup details such as hyperparameter values (e.g., learning rate, batch size, epochs) or detailed training configurations for the underlying models. |