Deep Reinforcement Learning for Cost-Effective Medical Diagnosis
Authors: Zheng Yu, Yikuan Li, Joseph Chahn Kim, Kaixuan Huang, Yuan Luo, Mengdi Wang
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments with real-world data validate that SM-DDPO trains efficiently and identifies all Pareto-front solutions. Across all tasks, SM-DDPO achieves state-of-the-art diagnosis accuracy (in some cases higher than conventional methods) with up to 85% reduction in testing cost. Core code is available on GitHub. |
| Researcher Affiliation | Academia | Zheng Yu (Princeton University), Yikuan Li (Northwestern University), Joseph C. Kim (Princeton University), Kaixuan Huang (Princeton University), Yuan Luo (Northwestern University), Mengdi Wang (Princeton University) |
| Pseudocode | Yes | Algorithm 1 Semi-Model-Based Deep Diagnosis Policy Optimization (SM-DDPO) |
| Open Source Code | Yes | Core code is available on GitHub: https://github.com/Zheng321/Deep-Reinforcement-Learning-for-Cost-Effective-Medical-Diagnosis |
| Open Datasets | Yes | We followed steps in Zimmerman et al. (2019) to extract 23,950 ICU visits of 19,811 patients from the MIMIC-III dataset (Johnson et al., 2016). |
| Dataset Splits | Yes | We split each dataset into 3 parts: training set (75%), validation set (15%), and test set (10%). |
| Hardware Specification | No | Quest provides computing access of over 11,800 CPU cores. In our experiment, we deploy each model training job to one CPU, so that multiple configurations can be tested simultaneously. Each training job requires a wall-time less than 2 hours of a single CPU core. |
| Software Dependencies | No | We use the Python package stable-baselines3 (Raffin et al., 2021) for implementing PPO. |
| Experiment Setup | Yes | We list all the hyper-parameters we tuned in Table 7, including both the tuning range and final selection. Table 7 lists specific values such as batch size 256, learning rate 1e-3, hidden size 64 (3-layer network), and 1024 timesteps per update. |
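The 75%/15%/10% train/validation/test split reported above can be sketched as follows. This is a minimal illustration with hypothetical function and variable names; the paper's actual splitting code (random seed, stratification, per-patient grouping) is not specified here.

```python
import numpy as np


def split_dataset(n_samples: int, seed: int = 0):
    """Randomly partition sample indices into train (75%),
    validation (15%), and test (10%) sets."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    n_train = int(0.75 * n_samples)
    n_val = int(0.15 * n_samples)
    train = idx[:n_train]
    val = idx[n_train:n_train + n_val]
    test = idx[n_train + n_val:]
    return train, val, test


train, val, test = split_dataset(1000)
print(len(train), len(val), len(test))  # 750 150 100
```

Note that for clinical datasets such as MIMIC-III, splitting is often done at the patient level rather than the visit level to avoid leakage; the sketch above splits plain indices.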
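The Table 7 values quoted above map naturally onto keyword arguments of stable-baselines3's `PPO` constructor (`batch_size`, `learning_rate`, `n_steps`, `policy_kwargs`). A hedged sketch of such a configuration, assuming a 3-layer MLP policy with hidden size 64; the exact mapping used by the authors is not stated:

```python
# Hyper-parameter selections reported in Table 7 of the paper,
# expressed as stable-baselines3 PPO keyword arguments (assumed mapping).
ppo_config = {
    "batch_size": 256,                       # "Batch size 256"
    "learning_rate": 1e-3,                   # "Learning rate 1e-3"
    "n_steps": 1024,                         # "# Timesteps per update 1024"
    "policy_kwargs": {"net_arch": [64, 64, 64]},  # "Hidden size (3-layer) 64"
}

# Usage (requires stable-baselines3 and an environment, e.g.):
#   from stable_baselines3 import PPO
#   model = PPO("MlpPolicy", env, **ppo_config)
print(ppo_config["learning_rate"])  # 0.001
```

Keeping the configuration in a plain dict makes it easy to sweep the tuning ranges from Table 7 by substituting values before constructing the model.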