Principal-Agent Reward Shaping in MDPs

Authors: Omer Ben-Porat, Yishay Mansour, Michal Moshkovitz, Boaz Taitler

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | 'We experimentally validate some of our theoretical findings via simulations in the appendix.'
Researcher Affiliation | Collaboration | Technion - Israel Institute of Technology, Israel; Tel Aviv University, Israel; Google Research; Bosch Center for Artificial Intelligence
Pseudocode | Yes | Algorithm 1: Stochastic Trees principal-Agent Reward shaping (STAR); Algorithm 2: Deterministic Finite horizon principal-Agent Reward shaping (DFAR)
Open Source Code | No | The paper states 'We experimentally validate some of our theoretical findings via simulations in the appendix,' which implies code was written, but it gives no explicit statement of code availability and no link to a repository.
Open Datasets | No | The paper is theoretical, focusing on algorithms for MDPs and DDPs; it mentions no real-world datasets and provides no access information for any dataset.
Dataset Splits | No | The paper describes no experiments involving training, validation, or test splits; its focus is theoretical analysis and algorithm design for abstract models.
Hardware Specification | No | The paper gives no details about the hardware used to run its simulations.
Software Dependencies | No | The paper does not list software dependencies or version numbers needed for reproducibility.
Experiment Setup | No | The paper presents algorithms and theoretical results but gives no experimental setup details, such as hyperparameters, training configurations, or system-level settings, in the main text.
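
Since the paper releases no code, nothing below reproduces its simulations. For readers unfamiliar with the setting, here is a minimal, self-contained Python sketch of the generic principal-agent reward-shaping interaction the title refers to: a principal with a fixed bonus budget shapes a self-interested agent's rewards, the agent best-responds, and the principal keeps the utility of the induced behavior net of the bonus paid. The one-shot decision, the payoff numbers, the grid search, and the tie-breaking rule are all illustrative assumptions, not the paper's STAR or DFAR algorithms.

```python
import itertools

# Toy one-shot "tree": the agent picks a single action at the root.
# All names and numbers are illustrative assumptions, not paper values.
agent_reward = {"a": 1.0, "b": 0.6, "c": 0.2}      # agent's intrinsic payoff
principal_reward = {"a": 0.0, "b": 0.5, "c": 2.0}  # principal's payoff

BUDGET = 1.0                        # total bonus the principal may offer
GRID = [0.0, 0.25, 0.5, 0.75, 1.0]  # coarse grid of per-action bonus levels

def agent_best_response(bonus):
    """Agent maximizes its own shaped reward; ties break in the
    principal's favor (a common assumption in principal-agent models)."""
    return max(agent_reward,
               key=lambda a: (agent_reward[a] + bonus[a], principal_reward[a]))

# Principal commits first: brute-force search over bonus schemes within budget.
best_utility, best_bonus = float("-inf"), None
for combo in itertools.product(GRID, repeat=len(agent_reward)):
    if sum(combo) > BUDGET:
        continue
    bonus = dict(zip(agent_reward, combo))
    chosen = agent_best_response(bonus)
    # The principal pays only the bonus attached to the action actually taken.
    utility = principal_reward[chosen] - bonus[chosen]
    if utility > best_utility:
        best_utility, best_bonus = utility, bonus

print("best bonus scheme:", best_bonus)
print("induced action:", agent_best_response(best_bonus))
print("principal utility:", best_utility)
```

Judging by their names, the paper's STAR and DFAR algorithms target the multi-step versions of this problem (stochastic trees and deterministic finite-horizon processes, respectively) rather than the exhaustive search sketched above.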