Consistent Aggregation of Objectives with Diverse Time Preferences Requires Non-Markovian Rewards

Authors: Silviu Pitis

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | This paper takes a normative approach to multi-objective agency: from a set of intuitively appealing axioms, it is shown that Markovian aggregation of Markovian reward functions is not possible when the time preference (discount factor) for each objective may vary. It follows that optimal multi-objective agents must admit rewards that are non-Markovian with respect to the individual objectives. To this end, a practical non-Markovian aggregation scheme is proposed, which overcomes the impossibility with only one additional parameter for each objective. This work offers new insights into sequential, multi-objective agency and intertemporal choice, and has practical implications for the design of AI systems deployed to serve multiple generations of principals with varying time preference.
Researcher Affiliation | Academia | Silviu Pitis, University of Toronto and Vector Institute, spitis@cs.toronto.edu
Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks. Its content consists of theoretical derivations, theorems, and discussions.
Open Source Code | No | The paper does not provide any information about open-source code for the methodology described.
Open Datasets | No | The paper is theoretical and does not utilize datasets for training or evaluation. The 'numerical example' provided is conceptual, not based on empirical data.
Dataset Splits | No | The paper is theoretical and does not discuss training, validation, or test data splits.
Hardware Specification | No | The paper is theoretical and does not describe any hardware specifications used for experiments.
Software Dependencies | No | The paper is theoretical and does not specify any software dependencies with version numbers.
Experiment Setup | No | The paper is theoretical and does not describe any empirical experimental setup details, hyperparameters, or training configurations.
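The core claim above (that objectives with differing discount factors cannot be aggregated into a single Markovian reward, but can be aggregated non-Markovianly with one extra parameter per objective) can be illustrated numerically. The sketch below is not the paper's exact scheme; the discount factors, weights, and reward streams are hypothetical. It shows that rescaling each objective's reward by the time-dependent factor (gamma_i / Gamma)**t lets a single agent with global discount Gamma recover the weighted sum of per-objective discounted returns, at the cost of a reward that depends on the timestep t and is therefore non-Markovian:

```python
import random

def discounted_return(rewards, gamma):
    """Sum of gamma**t * r_t over a finite reward stream."""
    return sum(gamma**t * r for t, r in enumerate(rewards))

# Hypothetical setup: two objectives with different discount factors.
gammas = [0.9, 0.5]     # per-objective time preferences
weights = [1.0, 2.0]    # aggregation weights
Gamma = 0.95            # the agent's single global discount factor

T = 20
random.seed(0)
streams = [[random.random() for _ in range(T)] for _ in gammas]

# Target: weighted sum of each objective's OWN discounted return.
target = sum(w * discounted_return(s, g)
             for w, g, s in zip(weights, gammas, streams))

# Non-Markovian aggregate reward: each objective is rescaled by
# (gamma_i / Gamma)**t, which depends on t, not just the current state.
agg = [sum(w * (g / Gamma)**t * s[t]
           for w, g, s in zip(weights, gammas, streams))
       for t in range(T)]
value = discounted_return(agg, Gamma)

assert abs(value - target) < 1e-9  # the two quantities agree
```

Algebraically, Gamma**t * (gamma_i / Gamma)**t = gamma_i**t, so discounting the rescaled aggregate at rate Gamma reproduces each objective's own discounting exactly; the single extra quantity needed per objective is its discount factor gamma_i.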