On the Estimation of Treatment Effect with Text Covariates
Authors: Liuyi Yao, Sheng Li, Yaliang Li, Hongfei Xue, Jing Gao, Aidong Zhang
IJCAI 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To evaluate the effectiveness of the proposed method, we first conduct experiments on two semi-synthetic datasets. Experimental results show that the proposed method outperforms the state-of-the-art methods. Furthermore, on the real-world dataset, we verify the matching quality and demonstrate that, by imposing the conditional treatment adversarial training, the dependency between the treatment assignment and the nearly instrumental variables is removed. |
| Researcher Affiliation | Collaboration | Liuyi Yao (University at Buffalo), Sheng Li (University of Georgia), Yaliang Li (Alibaba Group), Hongfei Xue (University at Buffalo), Jing Gao (University at Buffalo), Aidong Zhang (University of Virginia). Emails: liuyiyao@buffalo.edu, sheng.li@uga.edu, yaliang.li@alibaba-inc.com, {hongfeix, jing}@buffalo.edu, aidong@virginia.edu |
| Pseudocode | No | The paper describes the model and training procedure using text and mathematical equations, but it does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any statement about releasing source code or a link to a code repository for the described methodology. |
| Open Datasets | Yes | The News dataset is first introduced in [Johansson et al., 2016]... IHDP Dataset is from the Infant Health and Development Program [Brooks-Gunn et al., 1992]... The dataset comes from the Consumer Financial Protection Bureau (CFPB): https://www.consumerfinance.gov/data-research/consumer-complaints/ |
| Dataset Splits | No | The paper mentions generating samples and dataset sizes, but it does not provide explicit details about train/validation/test splits, percentages, or sample counts used for reproduction. For instance, it states "We generate 1,000 samples for each realization" and provides total record counts, but no specific split methodology. |
| Hardware Specification | No | The paper does not explicitly describe any specific hardware components (e.g., CPU, GPU models, memory, cloud instances) used for running its experiments. |
| Software Dependencies | No | The paper mentions various methods and tools like GloVe, word2vec, and the NPCI package, but it does not provide specific version numbers for any software dependencies required to replicate the experiments. |
| Experiment Setup | No | The paper states that "The parameters of baselines are set as suggested by the original papers, and the hyper-parameter search of CTAM follows the scheme in [Shalit et al., 2017]". It mentions learning rates (ηΦ, ηΨ, and ηD) and hyperparameters (λ, α, β) in the loss function, but it does not provide their specific values or detailed configuration settings for replication. |
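The Experiment Setup row notes that the paper names the learning rates (ηΦ, ηΨ, ηD) and the loss weights (λ, α, β) and defers to the hyper-parameter search scheme of [Shalit et al., 2017], but reports no concrete values. As a hedged illustration of what a reproducible disclosure of that search might look like, the sketch below shows a generic random-search loop over those quantities. All sampling ranges, the `train_ctam` routine, and the `validation_metric` callable are placeholders introduced here for illustration; none of them come from the paper.

```python
import random

# Hypothetical sketch only: the ranges below and the `train_ctam` /
# `validation_metric` callables are placeholders, NOT values reported in
# the paper. They merely illustrate a random-search loop in the spirit of
# the scheme the paper cites (Shalit et al., 2017).

def random_config(rng):
    """Draw one candidate setting for the CTAM hyper-parameters."""
    return {
        "lr_phi":  10 ** rng.uniform(-4, -2),  # eta_Phi: representation network
        "lr_psi":  10 ** rng.uniform(-4, -2),  # eta_Psi: matching/outcome component
        "lr_disc": 10 ** rng.uniform(-4, -2),  # eta_D: treatment discriminator
        "lambda_": 10 ** rng.uniform(-2, 1),   # weight of the adversarial loss term
        "alpha":   10 ** rng.uniform(-2, 1),   # weight of an auxiliary loss term
        "beta":    10 ** rng.uniform(-2, 1),   # weight of a regularization term
    }

def random_search(train_ctam, validation_metric, n_trials=20, seed=0):
    """Return the configuration with the best (lowest) validation score."""
    rng = random.Random(seed)
    best_cfg, best_score = None, float("inf")
    for _ in range(n_trials):
        cfg = random_config(rng)
        model = train_ctam(**cfg)          # placeholder training routine
        score = validation_metric(model)   # e.g., a held-out matching-quality proxy
        if score < best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score
```

For the setup to be reproducible from the paper alone, the actual search ranges, the number of trials, and the validation criterion used to select the final configuration would need to be stated by the authors.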