Principal-Agent Reward Shaping in MDPs
Authors: Omer Ben-Porat, Yishay Mansour, Michal Moshkovitz, Boaz Taitler
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We experimentally validate some of our theoretical findings via simulations in the appendix. |
| Researcher Affiliation | Collaboration | 1Technion Israel Institute of Technology, Israel 2Tel Aviv University, Israel 3Google Research 4Bosch Center for Artificial Intelligence |
| Pseudocode | Yes | Algorithm 1: Stochastic Trees principal-Agent Reward shaping (STAR) Algorithm 2: Deterministic Finite horizon principal-Agent Reward shaping (DFAR) |
| Open Source Code | No | The paper states 'We experimentally validate some of our theoretical findings via simulations in the appendix,' which implies code was written, but it provides no explicit code-availability statement or repository link. |
| Open Datasets | No | The paper is theoretical, focusing on algorithms for MDPs and DDPs; it does not mention any real-world dataset or provide access information for one. |
| Dataset Splits | No | The paper does not describe experiments using datasets with training, validation, or test splits. Its focus is theoretical analysis and algorithm design for abstract models. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used for running experiments or simulations. |
| Software Dependencies | No | The paper does not specify any software dependencies with version numbers needed for reproducibility. |
| Experiment Setup | No | The paper describes algorithms and theoretical findings but does not provide specific experimental setup details such as hyperparameters, training configurations, or system-level settings in the main text. |
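To make the paper's core concept concrete, the following is a minimal illustrative sketch of principal-agent reward shaping on a single decision point. It is not the paper's STAR or DFAR algorithm; the function name, the bonus rule, and the tie-breaking margin `EPS` are assumptions introduced here for illustration only.

```python
# Hypothetical toy sketch (not the paper's STAR/DFAR algorithms):
# a principal adds a minimal bonus to the agent's rewards so that
# the agent's greedy action coincides with the principal's preference.

EPS = 1e-6  # strict-preference margin; an assumption of this sketch


def shape_rewards(agent_rewards, principal_rewards):
    """Return per-action bonuses that make the principal's preferred
    action strictly optimal for the agent at minimal total bonus."""
    # The principal's preferred action.
    target = max(principal_rewards, key=principal_rewards.get)
    # Best alternative from the agent's point of view.
    best_other = max(r for a, r in agent_rewards.items() if a != target)
    # Smallest bonus that tips the agent toward the target action.
    bonus = max(0.0, best_other - agent_rewards[target] + EPS)
    return {a: (bonus if a == target else 0.0) for a in agent_rewards}


# The agent prefers "left" unshaped; the principal prefers "right".
agent = {"left": 1.0, "right": 0.2}
principal = {"left": 0.0, "right": 5.0}
bonuses = shape_rewards(agent, principal)
shaped = {a: agent[a] + bonuses[a] for a in agent}
```

After shaping, the agent's greedy choice switches to the principal's preferred action at the smallest bonus that achieves this, mirroring the budget-conscious incentive design the paper studies in full MDPs.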