Interpretable Off-Policy Learning via Hyperbox Search
Authors: Daniel Tschernutter, Tobias Hatt, Stefan Feuerriegel
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Using a simulation study, we demonstrate that our algorithm outperforms state-of-the-art methods from interpretable off-policy learning in terms of regret. Using real-world clinical data, we perform a user study with actual clinical experts, who rate our policies as highly interpretable. |
| Researcher Affiliation | Academia | ¹ETH Zurich, Switzerland; ²LMU, Germany. |
| Pseudocode | Yes | Algorithm 1 IOPL |
| Open Source Code | Yes | We provide a publicly available implementation of IOPL in Python. For solving the LP (linear program) relaxations and the pricing problem, we use Gurobi 9.0. We stop IOPL if it exceeds l = 50 branch-and-bound iterations as given in Algorithm 1. We set a maximum time limit of 180 seconds for solving the pricing problem in our experiments. We emphasize that this time limit was never exceeded in our experiments, including the experiments with the real-world clinical data (see Section 4.5.2 for a discussion of the reasons). Code available at https://github.com/DanielTschernutter/IOPL |
| Open Datasets | Yes | We draw upon the AIDS Clinical Trial Group (ACTG) study 175 (Hammer et al., 1996). |
| Dataset Splits | Yes | For all baselines, we use 80% of the data for training and 20% for validation. |
| Hardware Specification | Yes | We run all of our experiments on a server with two 16-core Intel Xeon Gold 6242 processors each with 2.8GHz and 192GB of RAM. |
| Software Dependencies | Yes | We provide a publicly available implementation of IOPL in Python. For solving the LP (linear program) relaxations and the pricing problem, we use Gurobi 9.0. |
| Experiment Setup | Yes | The hyperparameters are given in Table 2. Table 2: Hyperparameter Grids (e.g., 'initial learning rate {10^-1, 10^-2, 10^-3, 10^-4}', 'batch size {128, full}', 'regularization parameter ρ {10^-2, 10^-3, 10^-4}') |
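
The Dataset Splits and Experiment Setup rows describe an 80%/20% train/validation split for the baselines and a search over hyperparameter grids. A minimal sketch of that procedure follows, using only the Python standard library. It is not taken from the authors' code: the names `GRID`, `split_indices`, and `grid_configs` are hypothetical, the grid holds only the example values quoted above, and combining all three parameters into a single grid is an assumption (in the paper, Table 2 may assign them to different baselines).

```python
import itertools
import random

# Example values quoted in the Experiment Setup row; the full grids are in
# Table 2 of the paper. Grouping them into one grid is an assumption.
GRID = {
    "initial_learning_rate": [1e-1, 1e-2, 1e-3, 1e-4],
    "batch_size": [128, "full"],
    "regularization_rho": [1e-2, 1e-3, 1e-4],
}


def split_indices(n, train_frac=0.8, seed=0):
    """Shuffle range(n) and split it into train and validation index lists."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    cut = int(round(train_frac * n))
    return idx[:cut], idx[cut:]


def grid_configs(grid):
    """Yield one dict per combination of grid values."""
    keys = list(grid)
    for values in itertools.product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))


# Hypothetical sample size, chosen only to illustrate the 80/20 split.
train_idx, val_idx = split_indices(1000)
configs = list(grid_configs(GRID))
```

Each entry of `configs` would be fitted on the training indices and scored on the validation indices; the fitting and scoring steps depend on the specific baseline and are left out here.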