IMO^3: Interactive Multi-Objective Off-Policy Optimization
Authors: Nan Wang, Hongning Wang, Maryam Karimzadehgan, Branislav Kveton, Craig Boutilier
IJCAI 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate its effectiveness empirically on several multi-objective optimization problems. |
| Researcher Affiliation | Collaboration | Nan Wang1, Hongning Wang1, Maryam Karimzadehgan2, Branislav Kveton3, Craig Boutilier2; 1University of Virginia, 2Google Research, 3Amazon |
| Pseudocode | Yes | Algorithm 1 IMO3 |
| Open Source Code | No | The paper provides a link to an extended version on arXiv (https://arxiv.org/abs/2201.09798), but does not state that source code for the methodology is openly available or provide a direct link to a code repository. |
| Open Datasets | Yes | ZDT1. The ZDT test suite [Zitzler et al., 2000] is the most widely employed benchmark for MOO. We use ZDT1, the first problem in the test suite [...] Yahoo! News Recommendation. This is a news article recommendation problem derived from the Yahoo! Today Module click log dataset (R6A). |
| Dataset Splits | No | The paper mentions generating logged data and using it for off-policy evaluation, but it does not provide specific details on train/validation/test splits, percentages, or sample counts for the datasets used in its experiments. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, memory) used to conduct the experiments. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., programming languages, libraries, frameworks). |
| Experiment Setup | No | The paper describes the multi-objective problems and baselines, and reports parameters such as a pre-selection budget L = 500 and a fixed interaction budget T = 100, but it does not specify concrete hyperparameter values for the models, such as learning rates, batch sizes, or optimizer settings used in training or optimization. |
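For context on the Open Datasets row: ZDT1 is a standard two-objective benchmark whose closed-form definition comes from Zitzler et al. (2000), cited by the paper. The sketch below implements that standard definition; the function name `zdt1` is our own and this is not code from the paper.

```python
import math

def zdt1(x):
    """Standard ZDT1 benchmark (Zitzler et al., 2000).

    Two objectives to minimize over x in [0, 1]^n:
        f1(x) = x1
        g(x)  = 1 + 9 * sum(x2..xn) / (n - 1)
        f2(x) = g(x) * (1 - sqrt(f1(x) / g(x)))
    """
    n = len(x)
    f1 = x[0]
    g = 1.0 + 9.0 * sum(x[1:]) / (n - 1)
    f2 = g * (1.0 - math.sqrt(f1 / g))
    return f1, f2

# The Pareto-optimal front is f2 = 1 - sqrt(f1), attained when x2..xn are all zero.
```

With the conventional n = 30 decision variables, any point with zero tail coordinates lies on the front, e.g. `zdt1([0.25] + [0.0] * 29)` gives (0.25, 0.5).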