Designing Optimal Dynamic Treatment Regimes: A Causal Reinforcement Learning Approach

Authors: Junzhe Zhang

ICML 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our results are validated on multi-stage treatment regimes for lung cancer and dyspnoea. We evaluate the new algorithms on several SCMs, including multi-stage treatment regimes for lung cancer (Nease Jr & Owens, 1997) and dyspnoea (Cowell et al., 2006). We found that the new algorithms consistently outperform the state-of-art methods in terms of both the online performance and the efficiency of utilizing the observational data.
Researcher Affiliation | Academia | 1Department of Computer Science, Columbia University, New York, USA.
Pseudocode | Yes | (1) We propose an efficient procedure (Alg. 1) reducing the dimensionality of candidate policy space by exploiting the functional and independence restrictions encoded in the causal diagram. (2) We develop two novel online reinforcement learning algorithms (Algs. 2 and 3) for identifying the optimal DTR, leveraging the causal diagram, and that consistently dominate the state-of-art methods in terms of the performance.
Open Source Code | No | The paper does not provide an explicit statement about, or link to, open-source code. It links only to a technical report PDF: 'Zhang, J. and Bareinboim, E. Designing optimal dynamic treatment regimes: A causal reinforcement learning approach. Technical Report R-57, Causal Artificial Intelligence Lab, Columbia University, 2020. URL https://causalai.net/r47-full.pdf.'
Open Datasets | Yes | We test the model of treatment regimes for lung cancer described in (Nease Jr & Owens, 1997) and dyspnoea (Cowell et al., 2006).
Dataset Splits | No | The paper does not provide dataset split information (e.g., percentages or sample counts) for training, validation, or testing.
Hardware Specification | No | The paper does not provide hardware details such as GPU or CPU models, memory, or cloud computing specifications used for running the experiments.
Software Dependencies | No | The paper does not list software dependencies with version numbers (e.g., programming languages, libraries, or frameworks).
Experiment Setup | No | The paper states, 'We refer readers to (Zhang & Bareinboim, 2020, Appendix E) for more details on the experiments,' indicating that experimental setup details, such as hyperparameters or training configurations, are not present in the main text.