Multi-View Decision Processes: The Helper-AI Problem

Authors: Christos Dimitrakakis, David C. Parkes, Goran Radanovic, Paul Tylkin

NeurIPS 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In our experiments, we introduce intervention games as a way to construct example scenarios. ... Our results show that the proposed approach provides a large increase in utility in each domain, thus overcoming the deficiencies of P2's model, when the latter model is known to the AI.
Researcher Affiliation | Academia | Christos Dimitrakakis (Chalmers University of Technology & University of Lille); David C. Parkes (Harvard University); Goran Radanovic (Harvard University); Paul Tylkin (Harvard University)
Pseudocode | Yes | The complete pseudocode is given in Appendix C, algorithm 1.
Open Source Code | No | No explicit statement or link to open-source code for the described methodology was found.
Open Datasets | Yes | The food and shelter domain [Guo et al., 2013] involves an agent simultaneously trying to find randomly placed food (in one of the top five locations) while maintaining a shelter.
Dataset Splits | No | No explicit training/validation/test split percentages or sample counts were provided.
Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) used for running experiments were provided.
Software Dependencies | No | No specific software dependencies with version numbers were provided.
Experiment Setup | Yes | In all cases, we use a finite time horizon of 100 steps and a discount factor of γ = 0.95. ... When we change the error parameter, we keep the cost parameter constant (0.15 for the multilane highway domain and 0.1 for the food and shelter domain), and vice versa, when we change the cost, we keep the error constant (25 for the multilane highway domain and 0.25 for the food and shelter domain).
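
The experiment setup quoted above (a horizon of 100 steps, γ = 0.95, and one parameter varied while the other stays fixed) could be written down roughly as follows. This is only a sketch: the names HORIZON, GAMMA, DomainDefaults, DEFAULTS, discounted_return, and sweep are hypothetical and not taken from the paper, no code was released by the authors, and the swept values in the usage note are placeholders rather than the values used in the experiments.

```python
from dataclasses import dataclass, replace

# Values quoted in the experiment setup; the constant names are assumptions.
HORIZON = 100   # finite time horizon, in steps
GAMMA = 0.95    # discount factor


@dataclass(frozen=True)
class DomainDefaults:
    """Hypothetical container for the per-domain fixed parameters."""
    cost: float
    error: float


# Fixed values as reported in the summary above.
DEFAULTS = {
    "multilane_highway": DomainDefaults(cost=0.15, error=25),
    "food_and_shelter": DomainDefaults(cost=0.1, error=0.25),
}


def discounted_return(rewards, gamma=GAMMA, horizon=HORIZON):
    """Sum gamma**t * r_t over at most `horizon` steps."""
    return sum(gamma ** t * r for t, r in enumerate(rewards[:horizon]))


def sweep(domain, parameter, values):
    """Vary one parameter ('cost' or 'error') while the other stays fixed."""
    if parameter not in ("cost", "error"):
        raise ValueError(f"unknown parameter: {parameter}")
    base = DEFAULTS[domain]
    return [replace(base, **{parameter: value}) for value in values]
```

For instance, sweep("food_and_shelter", "cost", [0.0, 0.2]) yields two configurations that both keep the error at 0.25, mirroring the "vary one, fix the other" protocol described in the setup.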