Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Interpretable Off-Policy Learning via Hyperbox Search
Authors: Daniel Tschernutter, Tobias Hatt, Stefan Feuerriegel
ICML 2022 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Using a simulation study, we demonstrate that our algorithm outperforms state-of-the-art methods from interpretable off-policy learning in terms of regret. Using real-world clinical data, we perform a user study with actual clinical experts, who rate our policies as highly interpretable. |
| Researcher Affiliation | Academia | 1ETH Zurich, Switzerland 2LMU, Germany. |
| Pseudocode | Yes | Algorithm 1 IOPL |
| Open Source Code | Yes | We provide a publicly available implementation of IOPL in Python. For solving the LP (linear program) relaxations and the pricing problem, we use Gurobi 9.0. We stop IOPL if it exceeds l = 50 branch-and-bound iterations as given in Algorithm 1. We set a maximum time limit of 180 seconds for solving the pricing problem in our experiments. We emphasize that this time limit was never exceeded in our experiments, including the experiments with the real-world clinical data (see Section 4.5.2 for a discussion of the reasons). Code available at https://github.com/DanielTschernutter/IOPL |
| Open Datasets | Yes | We draw upon the AIDS Clinical Trial Group (ACTG) study 175 (Hammer et al., 1996). |
| Dataset Splits | Yes | For all baselines, we use 80% of the data for training and 20% for validation. |
| Hardware Specification | Yes | We run all of our experiments on a server with two 16-core Intel Xeon Gold 6242 processors each with 2.8GHz and 192GB of RAM. |
| Software Dependencies | Yes | We provide a publicly available implementation of IOPL in Python. For solving the LP (linear program) relaxations and the pricing problem, we use Gurobi 9.0. |
| Experiment Setup | Yes | The hyperparameters are given in Table 2. Table 2: Hyperparameter Grids (e.g., 'initial learning rate {10^-1, 10^-2, 10^-3, 10^-4}', 'batch size {128, full}', 'regularization parameter ρ {10^-2, 10^-3, 10^-4}') |
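The split and grid reported above can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the 80%/20% split, the grid values quoted from Table 2, and all variable names are taken or assumed from the excerpts above.

```python
# Hedged sketch of the reported validation protocol: an 80%/20%
# train/validation split and the hyperparameter grids quoted from Table 2.
from itertools import product

# Grid values as quoted from Table 2 of the paper.
learning_rates = [1e-1, 1e-2, 1e-3, 1e-4]
batch_sizes = [128, "full"]
regularization = [1e-2, 1e-3, 1e-4]

def split_indices(n, train_frac=0.8):
    """Deterministic split of n sample indices into train/validation parts."""
    cut = int(n * train_frac)
    return list(range(cut)), list(range(cut, n))

train_idx, val_idx = split_indices(1000)
grid = list(product(learning_rates, batch_sizes, regularization))
print(len(train_idx), len(val_idx), len(grid))  # 800 200 24
```

Each of the 24 grid configurations would be fit on the training portion and scored on the held-out 20%, consistent with the setup quoted under "Dataset Splits".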