Path-Specific Objectives for Safer Agent Incentives

Authors: Sebastian Farquhar, Ryan Carey, Tom Everitt

AAAI 2022, pp. 9529-9538 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We highlight the opportunities and dangers of these approaches empirically in a content recommendation environment from Krueger, Maharaj, and Leike (2020). Our main contributions are: we formalize the problem of delicate state as a complement to reward specification (Section 2); we propose path-specific objectives (Section 5); and we show this generalizes and unifies prior work (Section 6). From Section 7 (Experiments): We present two experimental tests of our approach in order to elaborate the underlying mathematical mechanisms. (A toy illustration of the path-specific idea is sketched at the end of this section.)
Researcher Affiliation | Collaboration | Sebastian Farquhar (1,2), Ryan Carey (1), Tom Everitt (2); 1: University of Oxford, 2: DeepMind
Pseudocode | No | The paper describes the methods conceptually and mathematically, including definitions and propositions, but it does not include any explicit pseudocode blocks or algorithms.
Open Source Code | No | The paper does not provide any specific links to source code repositories or state that the code for its methodology is publicly available.
Open Datasets | Yes | We demonstrate our method using the content recommendation simulation from Krueger, Maharaj, and Leike (2020).
Dataset Splits | No | The paper refers to a content recommendation simulation from a cited work but does not explicitly provide details on how the data was split into training, validation, and test sets. It mentions 'Number of steps' and 'Batch size' among the hyperparameters, but no split percentages or counts.
Hardware Specification | No | The paper does not include any specific details about the hardware used for running the experiments, such as GPU or CPU models.
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers, such as programming languages, libraries, or frameworks used.
Experiment Setup | Yes | Table 3: Content Recommendation Hyperparameters. Number of user types (K): 10; Number of article types (M): 10; Number of environments: 20; Initialization scale: 0.03; Loyalty update rate (α1): 0.03; Preference update rate: 0.003 (with normalization); Architecture: 1-layer, 100-unit ReLU MLP; Optimization algorithm: SGD (lr = 0.01, ρ = 0.1); Batch size: 10; Number of steps: 2000 (PBT every 10).
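
For concreteness, the reported architecture and optimizer can be instantiated roughly as follows. This is a hypothetical sketch, not code from the paper: the framework (PyTorch), the input and output dimensions (taken to match the 10 user types and 10 article types), and the reading of ρ as SGD momentum are all assumptions; only the layer width, activation, learning rate, batch size, and step count come from Table 3.

```python
# Hypothetical instantiation of the Table 3 setup (not the authors' code).
# Assumptions: PyTorch as the framework, a one-hot user-type input, an
# article-type score output, and rho interpreted as SGD momentum.
import torch
import torch.nn as nn

K, M = 10, 10  # number of user types and article types (Table 3)

# 1-layer, 100-unit ReLU MLP
model = nn.Sequential(
    nn.Linear(K, 100),   # assumed input: one-hot user type
    nn.ReLU(),
    nn.Linear(100, M),   # assumed output: one score per article type
)

# SGD(lr=0.01, rho=0.1), reading rho as momentum (assumption)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.1)

BATCH_SIZE = 10    # Table 3
NUM_STEPS = 2000   # Table 3; population-based training every 10 steps (not shown here)
```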
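
The path-specific-objective idea referenced in the Research Type row can also be illustrated with a toy example. The sketch below is purely illustrative: the environment dynamics, policies, and variable names are all assumptions, and it is neither the authors' implementation nor the Krueger, Maharaj, and Leike (2020) simulation. It shows only the core mechanism: reward is evaluated against a counterfactual version of the delicate state (here, user preferences) that evolves as if a fixed default policy had acted, so the trained policy gains no incentive to shift that state.

```python
# Toy illustration of a path-specific objective (all details are assumptions,
# not the paper's environment or code). The delicate state here is "user
# preferences": reward is scored against a counterfactual preference vector
# that evolves under a fixed default policy, removing any incentive for the
# learned policy to manipulate preferences on the factual path.
import numpy as np


def step(engagement, preferences, action, rng):
    """Assumed toy dynamics: both engagement and preferences respond to the action."""
    new_engagement = engagement + 0.1 * float(np.dot(preferences, action))
    new_preferences = preferences + 0.05 * action + 0.01 * rng.standard_normal(action.shape)
    return new_engagement, new_preferences


def path_specific_return(policy, default_policy, horizon, rng):
    """Return with the delicate variable (preferences) held to its default-policy path."""
    engagement = 0.0
    preferences = np.ones(3) / 3.0          # factual preferences
    cf_preferences = preferences.copy()     # counterfactual (non-manipulated) preferences
    total = 0.0
    for _ in range(horizon):
        action = policy(engagement, preferences)
        # Delicate path: preferences evolve as if the default policy had acted.
        cf_preferences = cf_preferences + 0.05 * default_policy(engagement, cf_preferences)
        # Reward uses the counterfactual preferences, cutting the manipulation path.
        total += float(np.dot(cf_preferences, action)) + engagement
        engagement, preferences = step(engagement, preferences, action, rng)
    return total


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    greedy = lambda eng, prefs: np.eye(3)[int(np.argmax(prefs))]  # push the top-scoring item
    default = lambda eng, prefs: np.ones(3) / 3.0                 # uniform "safe" baseline
    print(path_specific_return(greedy, default, horizon=10, rng=rng))
```

The paper formalizes this with causal models and path-specific effects rather than a forward simulation like this; the toy only gestures at the incentive-removal idea.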