Utility Theory for Sequential Decision Making

Authors: Mehran Shakerinava, Siamak Ravanbakhsh

ICML 2022

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | We extend these axioms to increasingly structured sequential decision making settings and identify the structure of the corresponding utility functions. In particular, we show that memoryless preferences lead to a utility in the form of a per transition reward and multiplicative factor on the future return. This result motivates a generalization of Markov Decision Processes (MDPs) with this structure on the agent's returns, which we call Affine-Reward MDPs.
Researcher Affiliation | Academia | (1) School of Computer Science, McGill University, Montreal, Canada; (2) Mila Quebec AI Institute.
Pseudocode | No | The paper is theoretical and focuses on mathematical proofs and axioms. It does not include any pseudocode or algorithm blocks.
Open Source Code | No | The paper does not contain any statements about releasing code, nor does it provide links to source code repositories.
Open Datasets | No | The paper is purely theoretical and does not mention any datasets used for training or evaluation.
Dataset Splits | No | The paper is theoretical and does not involve experiments or dataset splits for training, validation, or testing.
Hardware Specification | No | The paper is theoretical and does not describe any experimental setup or the hardware used for computations.
Software Dependencies | No | The paper is theoretical and does not mention any software dependencies or versions used for experiments.
Experiment Setup | No | The paper is theoretical and does not describe any experiments, hyperparameters, or training configurations.
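
The Research Type entry describes a utility made of "a per transition reward and multiplicative factor on the future return". One way to read that sentence is as a recursion in which the return of a trajectory equals the reward of its first transition plus a transition-dependent factor times the return of the rest. The sketch below illustrates that reading only; it is not code from the paper, which contains none. The names `affine_return`, `reward`, and `factor` are hypothetical, and the choice of a zero return for the empty trajectory is an assumption.

```python
from typing import Callable, Hashable, Sequence, Tuple

State = Hashable
Action = Hashable
Transition = Tuple[State, Action, State]


def affine_return(
    transitions: Sequence[Transition],
    reward: Callable[[State, Action, State], float],
    factor: Callable[[State, Action, State], float],
) -> float:
    """Return of a finite trajectory under the assumed recursion
    G(t_1, t_2, ...) = reward(t_1) + factor(t_1) * G(t_2, ...).

    Assumption: the empty trajectory has return 0.
    """
    total = 0.0
    # Work backwards so each step applies its own reward and factor
    # to the return of everything that follows it.
    for s, a, s_next in reversed(transitions):
        total = reward(s, a, s_next) + factor(s, a, s_next) * total
    return total


# A constant factor recovers the familiar discounted sum of rewards.
path = [("s0", "a", "s1"), ("s1", "a", "s2"), ("s2", "a", "s3")]
discounted = affine_return(path, lambda s, a, n: 1.0, lambda s, a, n: 0.5)
print(discounted)  # 1 + 0.5 * (1 + 0.5 * 1) = 1.75
```

With a constant factor this reduces to the standard discounted return of an MDP; letting the factor vary by transition is the extra freedom the quoted sentence attributes to Affine-Reward MDPs.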