Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.
Pessimistic Data Integration for Policy Evaluation
Authors: Xiangkun Wu, Ting Li, Gholamali Aminian, Armin Behnamnia, Hamid Rabiee, Chengchun Shi
NeurIPS 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Both our theoretical and numerical findings demonstrate that the proposed estimator achieves near-optimal performance across all scenarios. 5 Numerical experiments In this section, we evaluate the finite-sample performance of the proposed estimator, comparing it against EDO, MVE, CWE, and Non Pessi (introduced in Section 3.2). |
| Researcher Affiliation | Academia | Xiangkun Wu, School of Mathematical Sciences, Zhejiang University, Hangzhou, China; Ting Li, School of Statistics and Data Science, Shanghai University of Finance and Economics, Shanghai, China; Gholamali Aminian, The Alan Turing Institute, London, UK; Armin Behnamnia, Sharif University of Technology, Tehran, Iran; Hamid R. Rabiee, Department of Computer Engineering, Sharif University of Technology, Tehran, Iran; Chengchun Shi, Department of Statistics, London School of Economics and Political Science, London, UK |
| Pseudocode | No | The paper describes the methodology in Section 3.2 'A pessimistic estimator for data integration' and Appendix B 'Implementation Details' through structured prose and mathematical equations, along with a workflow diagram (Figure 1). However, it does not contain a dedicated pseudocode or algorithm block. |
| Open Source Code | Yes | To further enhance reproducibility, we include the full source simulation code in the supplementary material, covering both the algorithm implementation and the experimental setup. We provide full access to the simulation code and experimental scripts in the supplementary material, along with detailed instructions for reproducing the main experimental results. |
| Open Datasets | Yes | Example 5.2 (Ridesharing-data-based simulation). In this example, we construct a simulation environment based on a real-world A/A dataset collected from a ridesharing platform. ... Example A.1 (Clinical-data-based simulation). In this section, we construct a simulation environment based on the public real-world dataset ACTG175, which consists of 2,139 HIV-positive individuals randomized to four treatments. |
| Dataset Splits | Yes | To simplify the theoretical analysis, we follow [2] and study a sample-split version of the ATE estimator where half of the data triplets in D(e) and D(h) are used to estimate ŵ by solving (6), while the remaining half are used to construct the ATE estimator in (4). ... The experimental dataset contains |De| = 48 samples, and the historical dataset has |Dh| = m |De|, with m ∈ {1, 2, 3}. |
| Hardware Specification | Yes | All experiments were conducted on a high-performance computing node equipped with dual AMD EPYC 7742 64-Core Processors (128 logical cores). No GPUs were used. |
| Software Dependencies | No | While not explicitly mentioned in the main text, we use standard Python packages such as NumPy, SciPy, and scikit-learn in our implementation. All usage complies with the respective license terms. |
| Experiment Setup | Yes | Details of the data generating process are provided in Appendix A. In our implementation, we estimate this nuisance function via logistic regression. The outcome functions r(h)(0, S(h)), r(e)(0, S(e)), and r(e)(1, S(e)) can be flexibly estimated using a variety of regression models, including basis function expansions, random forests, and neural networks. In our implementation, we parameterized w(S) using a logistic model, w(S) = 1 / (1 + e^(-θS)). |
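
The logistic parameterization w(S) = 1 / (1 + e^(-θS)) and the half-and-half sample split quoted in the table can be illustrated with a minimal sketch. This is not the authors' code: the function names `logistic_weight` and `split_half`, the vector form S·θ, the example values of θ and S, and the random seed are all assumptions made for illustration. Only the formula, the use of NumPy, and the sample size of 48 come from the quoted text.

```python
import numpy as np


def logistic_weight(theta, S):
    """Evaluate w(S) = 1 / (1 + exp(-S @ theta)) for each row of S.

    Hypothetical helper: treating theta and S as vectors is an assumption;
    the quoted text only gives the scalar-looking form e^(-theta S).
    """
    S = np.atleast_2d(S)
    return 1.0 / (1.0 + np.exp(-S @ theta))


def split_half(n, seed=0):
    """Randomly split indices 0..n-1 into two disjoint halves.

    Hypothetical helper mirroring the quoted sample-split idea: one half
    would be used to estimate the weight, the other to build the estimator.
    """
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n)
    half = n // 2
    return perm[:half], perm[half:]


# Illustrative values only; none of these numbers come from the paper
# except the experimental sample size of 48.
theta = np.array([0.5, -1.0])
S = np.array([[0.0, 0.0], [2.0, 0.0]])
weights = logistic_weight(theta, S)

fit_idx, est_idx = split_half(48)
print(weights)
print(len(fit_idx), len(est_idx))
```

Because the weight passes through a logistic function, it always lies strictly between 0 and 1, and a state with S·θ = 0 receives weight 0.5.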