reproducibilityindex.ai

Learning to Influence Human Behavior with Offline Reinforcement Learning

Authors: Joey Hong, Sergey Levine, Anca Dragan

NeurIPS 2023

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We demonstrate that offline RL can solve two challenges with effective influence. First, we show that by learning from a dataset of suboptimal human-human interaction on a variety of tasks (none of which contains examples of successful influence), an agent can learn influence strategies to steer humans towards better performance even on new tasks. Second, we show that by also modeling and conditioning on human behavior, offline RL can learn to affect not just the human's actions but also their underlying strategy, and adapt to changes in their strategy.
Researcher Affiliation	Academia	Joey Hong, Sergey Levine, Anca Dragan; UC Berkeley; {joey_hong,sergey.levine,anca}@berkeley.edu
Pseudocode	No	The paper describes algorithmic approaches and modifications to CQL using text and mathematical equations, but it does not contain any clearly labeled 'Pseudocode' or 'Algorithm' blocks with structured steps.
Open Source Code	No	The paper does not contain any statement about releasing source code or provide a link to a code repository.
Open Datasets	No	We collected a dataset of human-human play where the human players were provided with one of several different instructions, in order to gather a diverse dataset that illustrates a variety of behaviors and human-human interactions. (No access information is provided for this collected dataset.)
Dataset Splits	No	The paper mentions data collection sizes (e.g., '20 human-human trajectories of length H = 1,200', '30 trajectories of length H = 400') for evaluation, but it does not specify explicit train/validation/test dataset splits, percentages, or methods for partitioning the data.
Hardware Specification	No	The paper does not provide any specific hardware details such as GPU or CPU models, memory specifications, or cloud/cluster resources used for running the experiments.
Software Dependencies	No	The paper refers to specific algorithms like CQL [18] and mentions neural networks, but it does not provide specific version numbers for any software components, libraries, or dependencies (e.g., Python, PyTorch, TensorFlow).
Experiment Setup	No	The paper states: 'We defer implementation details, i.e., architecture and hyperparameter choices, to Appendix A.' and 'We describe the high-level approach but defer implementation details to Appendix A.' Since Appendix A is not provided in the main text, specific experimental setup details are not present.