Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Treatment Effect Estimation with Data-Driven Variable Decomposition

Authors: Kun Kuang, Peng Cui, Bo Li, Meng Jiang, Shiqiang Yang, Fei Wang

AAAI 2017

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We apply our algorithm on the synthetic dataset and real online advertising dataset to estimate the ATE. ... From Tab.1, we have following observations. ... Third, our D2VD( ) estimator, which has no variables separation step, can get the similar results with DR estimator. But with considering the separation between confounders and adjustment variables, our D2VD estimator can improve the accuracy (smaller Bias) and reduce the variance (smaller SD) for ATE estimation from D2VD( ), DR and other baseline estimators under different settings.
Researcher Affiliation	Academia	1Tsinghua National Laboratory for Information Science and Technology 2Department of Computer Science and Technology, Tsinghua University 3School of Economics and Management, Tsinghua University 4Department of Computer Science, University of Illinois Urbana-Champaign 5Department of Healthcare Policy and Research, Weill Cornell Medical School, Cornell University
Pseudocode	Yes	Algorithm 1 Data-Driven Variable Decomposition (D2VD)
Open Source Code	No	The paper does not provide any explicit statement about releasing source code for their D2VD algorithm, nor does it provide a link to a code repository.
Open Datasets	No	The paper mentions using a 'real online advertising dataset... collected during Sep. 2015 from Tencent WeChat App'. While it provides details about the dataset's nature and size, it does not provide concrete access information (e.g., link, DOI, or formal citation for a public repository) for this dataset. The synthetic dataset is generated programmatically rather than being a pre-existing public resource.
Dataset Splits	No	The paper does not explicitly provide specific training/test/validation dataset splits (e.g., percentages, sample counts, or cross-validation setup) for either the synthetic or real-world datasets used in the experiments.
Hardware Specification	No	The paper does not provide any specific hardware details (e.g., CPU or GPU models, memory, or cloud instance types) used for running the experiments.
Software Dependencies	No	The paper does not specify any software dependencies with version numbers (e.g., specific libraries, frameworks, or programming language versions) used for implementation.
Experiment Setup	Yes	During the parameters tuning, we set the matching threshold ϵ = 5, which make the matching estimator is close to the exactly matching. The hyper-parameters of λ, δ, τ, η and μ set as 30, 50, 90, 70 and 30 by using grid search.