Two-way Deconfounder for Off-policy Evaluation in Causal Reinforcement Learning

Authors: Shuguang Yu, Shuxing Fang, Ruixin Peng, Zhengling Qi, Fan Zhou, Chengchun Shi

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We illustrate the effectiveness of the proposed estimator through theoretical results and numerical experiments.
Researcher Affiliation | Academia | Shuguang Yu (School of Statistics and Management, Shanghai University of Finance and Economics, Shanghai, China); Shuxing Fang (Department of Applied Mathematics, The Hong Kong Polytechnic University, Hong Kong, China); Ruixin Peng (School of Statistics and Management, Shanghai University of Finance and Economics, Shanghai, China); Zhengling Qi (Department of Decision Sciences, George Washington University, Washington D.C., USA); Fan Zhou (School of Statistics and Management, Shanghai University of Finance and Economics, Shanghai, China); Chengchun Shi (Department of Statistics, London School of Economics and Political Science, London, UK)
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | The source code is available on GitHub: https://github.com/fsmiu/Two-way-Deconfounder.
Open Datasets | Yes | The data is subject to an agreement and cannot be shared due to its sensitivity, but it is publicly available at https://physionet.org/content/mimiciii/1.4/.
Dataset Splits | Yes | Each dataset undergoes a 75/25 split into training and validation sets, respectively.
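The 75/25 train/validation split could be implemented as follows; the ratio is reported in the paper, but the random-permutation scheme and function name here are illustrative assumptions, since the exact splitting code is not quoted.

```python
import numpy as np

def train_val_split(n_trajectories, train_frac=0.75, seed=0):
    """Randomly partition trajectory indices into train/validation sets.

    The 75/25 ratio follows the paper; the shuffling scheme and the
    seed are assumptions made for this sketch.
    """
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_trajectories)
    n_train = int(train_frac * n_trajectories)
    return idx[:n_train], idx[n_train:]
```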
Hardware Specification | Yes | The Two-way Deconfounder model described in Section 3 was implemented in PyTorch and trained on an NVIDIA GeForce RTX 3090.
Software Dependencies | No | The Two-way Deconfounder model described in Section 3 was implemented in PyTorch, but no version numbers for PyTorch or other dependencies are provided.
Experiment Setup | Yes | The search range for each hyperparameter is as follows: learning rate lr ∈ {0.005, 0.001}; batch size bs ∈ {2^8, 2^9, 2^10, 2^11, 2^12}; weight decay λ ∈ {0.01, 0.0001}; two-way embedding dimension d_tw ∈ {2^1, 2^2, 2^3}; loss weighting α ∈ {0.0, 0.3, 0.5, 0.7}.
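The reported hyperparameter ranges can be sketched as a grid-search enumeration. The values below follow the paper's stated ranges, with the batch-size and embedding-dimension entries read as powers of two (2^8 through 2^12 and 2^1 through 2^3); the grid-search helper itself is an illustrative assumption, as the paper does not show its tuning code.

```python
import itertools

# Hyperparameter search ranges as reported in the paper.
search_space = {
    "lr": [0.005, 0.001],                                  # learning rate
    "batch_size": [2**8, 2**9, 2**10, 2**11, 2**12],       # bs
    "weight_decay": [0.01, 0.0001],                        # lambda
    "d_tw": [2**1, 2**2, 2**3],                            # two-way embedding dim
    "alpha": [0.0, 0.3, 0.5, 0.7],                         # loss weighting
}

def grid(space):
    """Yield every configuration in the Cartesian product of the ranges."""
    keys = list(space)
    for values in itertools.product(*(space[k] for k in keys)):
        yield dict(zip(keys, values))

configs = list(grid(search_space))  # 2 * 5 * 2 * 3 * 4 = 240 configurations
```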