Utility Theory for Sequential Decision Making
Authors: Mehran Shakerinava, Siamak Ravanbakhsh
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We extend these axioms to increasingly structured sequential decision making settings and identify the structure of the corresponding utility functions. In particular, we show that memoryless preferences lead to a utility in the form of a per transition reward and multiplicative factor on the future return. This result motivates a generalization of Markov Decision Processes (MDPs) with this structure on the agent's returns, which we call Affine-Reward MDPs. |
| Researcher Affiliation | Academia | 1 School of Computer Science, McGill University, Montreal, Canada; 2 Mila - Quebec AI Institute. |
| Pseudocode | No | The paper is theoretical and focuses on mathematical proofs and axioms. It does not include any pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain any statements about releasing code, nor does it provide links to source code repositories. |
| Open Datasets | No | The paper is purely theoretical and does not mention any datasets used for training or evaluation. |
| Dataset Splits | No | The paper is theoretical and does not involve experiments or dataset splits for training, validation, or testing. |
| Hardware Specification | No | The paper is theoretical and does not describe any experimental setup or the hardware used for computations. |
| Software Dependencies | No | The paper is theoretical and does not mention any software dependencies or versions used for experiments. |
| Experiment Setup | No | The paper is theoretical and does not describe any experiments, hyperparameters, or training configurations. |
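
The Research Type entry describes a return built from a per-transition reward plus a multiplicative factor on the future return. The sketch below illustrates that recursion under stated assumptions; it is not code from the paper, which provides none. The names `affine_return`, `reward`, and `factor` are hypothetical, the trajectory is assumed finite, and the return after the final transition is assumed to be zero.

```python
from typing import Callable, Hashable, Sequence, Tuple

State = Hashable
Action = Hashable
Transition = Tuple[State, Action, State]  # (state, action, next_state)


def affine_return(
    transitions: Sequence[Transition],
    reward: Callable[[State, Action, State], float],
    factor: Callable[[State, Action, State], float],
) -> float:
    """Evaluate G_t = reward(s, a, s') + factor(s, a, s') * G_{t+1}.

    Assumption (not from the paper): the trajectory is finite and the
    return after the last transition is 0.
    """
    g = 0.0
    # Work backwards so each step can reuse the return of its successor.
    for s, a, s_next in reversed(transitions):
        g = reward(s, a, s_next) + factor(s, a, s_next) * g
    return g


# Hypothetical three-step trajectory with unit rewards and a constant factor.
trajectory = [(0, "a", 1), (1, "a", 2), (2, "a", 3)]
total = affine_return(
    trajectory,
    reward=lambda s, a, s_next: 1.0,
    factor=lambda s, a, s_next: 0.5,
)
```

With a constant factor, this reduces to the familiar discounted sum of rewards; letting `factor` depend on the transition is what the more general structure allows.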