Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Budgeted Heterogeneous Treatment Effect Estimation
Authors: Tian Qin, Tian-Zuo Wang, Zhi-Hua Zhou
ICML 2021 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments across three datasets show that our method outperforms baselines given a fixed observational data budget. |
| Researcher Affiliation | Academia | National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China. |
| Pseudocode | Yes | Algorithm 1 Core Set |
| Open Source Code | No | The paper does not provide any explicit statement or link to open-source code for the described methodology. |
| Open Datasets | Yes | IHDP. This is a common benchmark dataset introduced by Hill (2011). |
| Dataset Splits | Yes | We average over 1,000 realizations of the outcomes with 63/27/10 train/validation/test splits. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for running experiments. |
| Software Dependencies | No | The paper mentions several algorithms and frameworks like CFR, stochastic gradient descent, and the Sinkhorn-Knopp algorithm, but does not provide specific version numbers for any software dependencies or libraries. |
| Experiment Setup | Yes | QHTE uses 3 layers to parameterize the representation mapping function Φ, and 3 layers for the outcome prediction function f. Layer sizes are 200 for each of the first 3 layers, and 100 for others. All but the output layer use ReLU (Rectified Linear Unit) (Agarap, 2018) as activation functions, and use batch normalization (Ioffe & Szegedy, 2015) to facilitate training. We use stochastic gradient descent with an initial learning rate of 0.001 and a batch size of 100 to train the network. The learning rate decays with a factor of 0.1 when the validation error plateaus. We set α = 1 × 10⁻⁴ and γ = 1. |
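
The Dataset Splits and Experiment Setup rows quote a 63/27/10 train/validation/test split and a learning rate that starts at 0.001 and is multiplied by 0.1 when the validation error plateaus. Since the paper releases no code, the following is only a minimal standard-library sketch of those two settings, not the authors' implementation. The names `split_indices` and `PlateauDecay`, the `seed` argument, and the `patience` threshold are hypothetical; the quoted text does not say how a plateau is detected.

```python
import random


def split_indices(n, fractions=(0.63, 0.27, 0.10), seed=0):
    """Shuffle range(n) and cut it into train/validation/test index lists.

    The 63/27/10 fractions come from the quoted setup; the seed and the
    rounding rule are assumptions made for this illustration.
    """
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = round(fractions[0] * n)
    n_val = round(fractions[1] * n)
    # Whatever remains after train and validation goes to the test set.
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]


class PlateauDecay:
    """Multiply the learning rate by `factor` when validation error stalls."""

    def __init__(self, lr=0.001, factor=0.1, patience=5):
        self.lr = lr              # initial learning rate from the quoted setup
        self.factor = factor      # decay factor from the quoted setup
        self.patience = patience  # assumed: epochs without improvement tolerated
        self.best = float("inf")
        self.stale = 0

    def step(self, val_error):
        """Record one epoch's validation error and return the current rate."""
        if val_error < self.best:
            self.best = val_error
            self.stale = 0
        else:
            self.stale += 1
            if self.stale >= self.patience:
                self.lr *= self.factor
                self.stale = 0
        return self.lr


if __name__ == "__main__":
    train, val, test = split_indices(1000)
    print(len(train), len(val), len(test))

    schedule = PlateauDecay(patience=2)
    for err in [1.0, 0.9, 0.9, 0.9]:
        print(schedule.step(err))
```

In a real training loop, `schedule.step` would be called once per epoch with the measured validation error, and the returned rate passed to the optimizer with batches of 100 examples drawn from the training indices.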