A Reinforcement Learning Framework for Dynamic Mediation Analysis
Authors: Lin Ge, Jitao Wang, Chengchun Shi, Zhenke Wu, Rui Song
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The superior performance of the proposed method is demonstrated through extensive numerical studies, theoretical results, and an analysis of a mobile health dataset. |
| Researcher Affiliation | Academia | ¹North Carolina State University; ²University of Michigan, Ann Arbor; ³London School of Economics and Political Science. |
| Pseudocode | No | No pseudocode or algorithm blocks were explicitly labeled or presented in the paper. |
| Open Source Code | Yes | A Python implementation of the proposed procedure is available at https://github.com/linlinlin97/MediationRL. |
| Open Datasets | Yes | In this section, we apply the proposed MR estimators to analyze the real dataset from the IHS (NeCamp et al., 2020). |
| Dataset Splits | Yes | It is worth noting that we used cross-validation to estimate the ATE of πopt. |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory specifications, or cluster configurations) used for running experiments are provided in the paper. |
| Software Dependencies | No | The paper mentions 'A Python implementation' but does not specify version numbers for Python or any other software dependencies, libraries, or solvers used for the experiments. |
| Experiment Setup | Yes | We consider a scenario with discrete states, actions, mediators, and rewards. We set time $T = 50$, and $S_0$ for each trajectory is sampled from a Bernoulli distribution with a mean probability of 0.5. Denote the sigmoid function as $\mathrm{expit}(\cdot)$. Following the behavior policy, the action $A_t \in \{0, 1\}$ is sampled from a Bernoulli distribution, where $\Pr(A_t = 1 \mid S_t) = \mathrm{expit}(1.0 - 2.0 S_t)$. Observing $S_t$ and $A_t$, the mediator $M_t \in \{0, 1\}$ is drawn from a Bernoulli distribution with $\Pr(M_t = 1 \mid S_t, A_t) = \mathrm{expit}(1.0 - 1.5 S_t + 2.5 A_t)$. |
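
The sampling steps quoted in the Experiment Setup row can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the function names are hypothetical, the minus signs in the logits are assumed to have been lost in text extraction, and the state transition and reward models are omitted because they are not quoted above.

```python
import numpy as np


def expit(x):
    """Sigmoid function, written expit(.) in the setup description."""
    return 1.0 / (1.0 + np.exp(-x))


def sample_initial_states(n, rng):
    """Draw S_0 ~ Bernoulli(0.5) for n trajectories."""
    return rng.binomial(1, 0.5, size=n)


def sample_action_mediator(states, rng):
    """Draw A_t from the behavior policy, then M_t given (S_t, A_t).

    Assumed logits (signs reconstructed, see the note above):
      Pr(A_t = 1 | S_t)      = expit(1.0 - 2.0 * S_t)
      Pr(M_t = 1 | S_t, A_t) = expit(1.0 - 1.5 * S_t + 2.5 * A_t)
    """
    states = np.asarray(states)
    actions = rng.binomial(1, expit(1.0 - 2.0 * states))
    mediators = rng.binomial(1, expit(1.0 - 1.5 * states + 2.5 * actions))
    return actions, mediators


# One time step for 10,000 hypothetical trajectories.
rng = np.random.default_rng(0)
s0 = sample_initial_states(10_000, rng)
a0, m0 = sample_action_mediator(s0, rng)
```

Extending this to the full horizon of T = 50 would require the state transition and reward distributions from the paper, which this summary does not reproduce.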