On the Estimation of Treatment Effect with Text Covariates
Authors: Liuyi Yao, Sheng Li, Yaliang Li, Hongfei Xue, Jing Gao, Aidong Zhang
IJCAI 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To evaluate the effectiveness of the proposed method, we first conduct experiments on two semi-synthetic datasets. Experimental results show that the proposed method outperforms the state-of-the-art methods. Furthermore, on the real-world dataset, we verify the matching quality and demonstrate that, by imposing the conditional treatment adversarial training, the dependency between the treatment assignment and the nearly instrumental variables is removed. |
| Researcher Affiliation | Collaboration | Liuyi Yao (University at Buffalo), Sheng Li (University of Georgia), Yaliang Li (Alibaba Group), Hongfei Xue (University at Buffalo), Jing Gao (University at Buffalo), Aidong Zhang (University of Virginia). Emails: liuyiyao@buffalo.edu, sheng.li@uga.edu, yaliang.li@alibaba-inc.com, {hongfeix, jing}@buffalo.edu, aidong@virginia.edu |
| Pseudocode | No | The paper describes the model and training procedure using text and mathematical equations, but it does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any statement about releasing source code or a link to a code repository for the described methodology. |
| Open Datasets | Yes | The News dataset is first introduced in [Johansson et al., 2016]... IHDP Dataset is from the Infant Health and Development Program [Brooks-Gunn et al., 1992]... The dataset comes from the Consumer Financial Protection Bureau (CFPB): https://www.consumerfinance.gov/data-research/consumer-complaints/ |
| Dataset Splits | No | The paper mentions generating samples and dataset sizes, but it does not provide explicit details about train/validation/test splits, percentages, or sample counts used for reproduction. For instance, it states "We generate 1,000 samples for each realization" and provides total record counts, but no specific split methodology. |
| Hardware Specification | No | The paper does not explicitly describe any specific hardware components (e.g., CPU, GPU models, memory, cloud instances) used for running its experiments. |
| Software Dependencies | No | The paper mentions various methods and tools like GloVe, word2vec, and the NPCI package, but it does not provide specific version numbers for any software dependencies required to replicate the experiments. |
| Experiment Setup | No | The paper states that "The parameters of baselines are set as suggested by the original papers, and the hyper-parameter search of CTAM follows the scheme in [Shalit et al., 2017]". It mentions learning rates (ηΦ, ηΨ, and ηD) and hyperparameters (λ, α, β) in the loss function, but it does not provide their specific values or detailed configuration settings for replication. |
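The Experiment Setup row notes that the paper names the learning rates (ηΦ, ηΨ, ηD) and the loss weights (λ, α, β) and defers to the hyper-parameter search scheme of [Shalit et al., 2017], but reports no concrete values. As a hedged illustration of what a reproducible disclosure of that search might look like, the sketch below shows a generic random-search loop over those quantities. All sampling ranges, the `train_ctam` routine, and the `validation_metric` callable are placeholders introduced here for illustration; none of them come from the paper.

```python
import random

# Hypothetical sketch only: the ranges below and the `train_ctam` /
# `validation_metric` callables are placeholders, NOT values reported in
# the paper. They merely illustrate a random-search loop in the spirit of
# the scheme the paper cites (Shalit et al., 2017).

def random_config(rng):
    """Draw one candidate setting for the CTAM hyper-parameters."""
    return {
        "lr_phi":  10 ** rng.uniform(-4, -2),  # eta_Phi: representation network
        "lr_psi":  10 ** rng.uniform(-4, -2),  # eta_Psi: matching/outcome component
        "lr_disc": 10 ** rng.uniform(-4, -2),  # eta_D: treatment discriminator
        "lambda_": 10 ** rng.uniform(-2, 1),   # weight of the adversarial loss term
        "alpha":   10 ** rng.uniform(-2, 1),   # weight of an auxiliary loss term
        "beta":    10 ** rng.uniform(-2, 1),   # weight of a regularization term
    }

def random_search(train_ctam, validation_metric, n_trials=20, seed=0):
    """Return the configuration with the best (lowest) validation score."""
    rng = random.Random(seed)
    best_cfg, best_score = None, float("inf")
    for _ in range(n_trials):
        cfg = random_config(rng)
        model = train_ctam(**cfg)          # placeholder training routine
        score = validation_metric(model)   # e.g., a held-out matching-quality proxy
        if score < best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score
```

For the setup to be reproducible from the paper alone, the actual search ranges, the number of trials, and the validation criterion used to select the final configuration would need to be stated by the authors.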