Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Interpretable Off-Policy Learning via Hyperbox Search

Authors: Daniel Tschernutter, Tobias Hatt, Stefan Feuerriegel

ICML 2022 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Using a simulation study, we demonstrate that our algorithm outperforms state-of-the-art methods from interpretable off-policy learning in terms of regret. Using real-world clinical data, we perform a user study with actual clinical experts, who rate our policies as highly interpretable."
Researcher Affiliation | Academia | "ETH Zurich, Switzerland; LMU, Germany."
Pseudocode | Yes | "Algorithm 1 IOPL"
Open Source Code | Yes | "We provide a publicly available implementation of IOPL in Python. For solving the LP (linear program) relaxations and the pricing problem, we use Gurobi 9.0. We stop IOPL if it exceeds l = 50 branch-and-bound iterations as given in Algorithm 1. We set a maximum time limit of 180 seconds for solving the pricing problem in our experiments. We emphasize that this time limit was never exceeded in our experiments, including the experiments with the real-world clinical data (see Section 4.5.2 for a discussion of the reasons). Code available at https://github.com/DanielTschernutter/IOPL"
Open Datasets | Yes | "We draw upon the AIDS Clinical Trial Group (ACTG) study 175 (Hammer et al., 1996)."
Dataset Splits | Yes | "For all baselines, we use 80% of the data for training and 20% for validation."
Hardware Specification | Yes | "We run all of our experiments on a server with two 16-core Intel Xeon Gold 6242 processors, each with 2.8 GHz, and 192 GB of RAM."
Software Dependencies | Yes | "We provide a publicly available implementation of IOPL in Python. For solving the LP (linear program) relaxations and the pricing problem, we use Gurobi 9.0."
Experiment Setup | Yes | "The hyperparameters are given in Table 2." Table 2 reports the hyperparameter grids, e.g., initial learning rate {10^-1, 10^-2, 10^-3, 10^-4}, batch size {128, full}, regularization parameter ρ {10^-2, 10^-3, 10^-4}.
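Taken together, the dataset-split and experiment-setup entries describe a standard grid search: enumerate every combination of the Table 2 hyperparameter values and evaluate each on an 80/20 train/validation split. A minimal plain-Python sketch of that procedure is below; the grid values come from Table 2 above, but the placeholder dataset and the function names (`train_val_split`, `grid_configs`) are illustrative, not part of the paper's implementation.

```python
import itertools
import random

# Hyperparameter grids as reported in Table 2.
GRID = {
    "initial_learning_rate": [1e-1, 1e-2, 1e-3, 1e-4],
    "batch_size": [128, "full"],
    "regularization_rho": [1e-2, 1e-3, 1e-4],
}

def train_val_split(data, train_frac=0.8, seed=0):
    """Shuffle and split data into train_frac training / rest validation."""
    rng = random.Random(seed)
    idx = list(range(len(data)))
    rng.shuffle(idx)
    cut = int(train_frac * len(data))
    train = [data[i] for i in idx[:cut]]
    val = [data[i] for i in idx[cut:]]
    return train, val

def grid_configs(grid):
    """Yield one dict per combination of the grid values."""
    keys = list(grid)
    for values in itertools.product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))

if __name__ == "__main__":
    data = list(range(100))  # placeholder dataset, not the ACTG 175 data
    train, val = train_val_split(data)
    print(len(train), len(val))                 # 80 20
    print(sum(1 for _ in grid_configs(GRID)))   # 4 * 2 * 3 = 24 configurations
```

Each of the 24 configurations would then be trained on the 80% split and scored on the 20% validation split, keeping the best-performing one.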