Collapsing Bandits and Their Application to Public Health Intervention
Authors: Aditya Mate, Jackson Killian, Haifeng Xu, Andrew Perrault, Milind Tambe
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our algorithm on several data distributions including data from a real-world healthcare task in which a worker must monitor and deliver interventions to maximize their patients' adherence to tuberculosis medication. Our algorithm achieves a 3-order-of-magnitude speedup compared to state-of-the-art RMAB techniques, while achieving similar performance. |
| Researcher Affiliation | Academia | Aditya Mate, Harvard University, Cambridge, MA, 02138, aditya_mate@g.harvard.edu; Jackson A. Killian, Harvard University, Cambridge, MA, 02138, jkillian@g.harvard.edu; Haifeng Xu, University of Virginia, Charlottesville, VA, 22903, hx4ad@virginia.edu; Andrew Perrault, Harvard University, Cambridge, MA, 02138, aperrault@g.harvard.edu; Milind Tambe, Harvard University, Cambridge, MA, 02138, milind_tambe@harvard.edu |
| Pseudocode | Yes | Algorithm 1: Sequential index computation algorithm |
| Open Source Code | Yes | The code is available at: https://github.com/AdityaMate/collapsing_bandits |
| Open Datasets | Yes | We first test on tuberculosis medication adherence monitoring data, which contains daily adherence information recorded for each real patient in the system, as obtained from Killian et al. [17]. |
| Dataset Splits | No | The paper does not explicitly state specific training, validation, or test dataset splits (e.g., percentages or sample counts). It mentions using real-world data and synthetic distributions for evaluation. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments (e.g., CPU, GPU models, or memory specifications). |
| Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies (e.g., libraries, frameworks, or programming languages). |
| Experiment Setup | Yes | Reward is measured as the undiscounted sum of patients (arms) in the adherent state over all rounds, where each trial lasts T = 180 days (matching the length of first-line TB treatment) with N patients and a budget of k calls per day. All experiments in this section set all δ to 0.05. ... We set the resource level, k = 10%N in our simulation for Fig. 5a. |
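
The reward measure in the Experiment Setup row can be illustrated with a minimal sketch. It assumes a simplified, fully observed two-state model (1 = adherent, 0 = non-adherent) and a uniformly random calling policy. The function names, the `p_adhere` table, and its probabilities are hypothetical and are not taken from the paper or its repository. Only T = 180, N patients, and a budget of k = 10% of N come from the setup quoted above.

```python
import random


def budget(n_patients, fraction=0.10):
    """Calls per day: k = 10% of N (at least one call)."""
    return max(1, int(round(fraction * n_patients)))


def random_policy(states, k, rng):
    """Hypothetical baseline: call k patients chosen uniformly at random."""
    return rng.sample(range(len(states)), k)


def simulate_total_reward(n_patients, horizon, p_adhere, policy, seed=0):
    """Undiscounted sum of adherent patients over all rounds.

    p_adhere[(state, called)] is the assumed probability that a patient is
    adherent on the next day, given the current state and whether the
    patient was called today (called is 0 or 1).
    """
    rng = random.Random(seed)
    k = budget(n_patients)
    states = [1] * n_patients  # assumption: everyone starts adherent
    total = 0
    for _ in range(horizon):
        called = set(policy(states, k, rng))
        if len(called) > k:
            raise ValueError("policy exceeded the daily budget")
        # Each patient transitions independently under the assumed model.
        states = [
            1 if rng.random() < p_adhere[(s, int(i in called))] else 0
            for i, s in enumerate(states)
        ]
        total += sum(states)  # one unit of reward per adherent patient-day
    return total


if __name__ == "__main__":
    # Illustrative probabilities only; calls help and adherence is sticky.
    p = {(1, 1): 0.95, (1, 0): 0.85, (0, 1): 0.60, (0, 0): 0.20}
    print(simulate_total_reward(100, 180, p, random_policy))
```

Swapping `random_policy` for another function with the same signature lets different policies be compared under the same budget and horizon. The partial observability that defines collapsing bandits is deliberately left out of this sketch.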