Markov Decision Processes with Time-Varying Geometric Discounting

Authors: Jiarui Gan, Annika Hennes, Rupak Majumdar, Debmalya Mandal, Goran Radanovic

AAAI 2023

Reproducibility Variable	Result	LLM Response
Research Type	Theoretical	This paper studies a model of infinite-horizon MDPs with time-varying discount factors. We take a game-theoretic perspective whereby each time step is treated as an independent decision maker with their own (fixed) discount factor and we study the subgame perfect equilibrium (SPE) of the resulting game as well as the related algorithmic problems. We present a constructive proof of the existence of an SPE and demonstrate the EXPTIME-hardness of computing an SPE. We also turn to the approximate notion of ϵ-SPE and show that an ϵ-SPE exists under milder assumptions. An algorithm is presented to compute an ϵ-SPE, of which an upper bound of the time complexity, as a function of the convergence property of the time-varying discount factor, is provided.
Researcher Affiliation	Academia	Jiarui Gan¹, Annika Hennes², Rupak Majumdar³, Debmalya Mandal³, Goran Radanovic³; ¹University of Oxford, ²Heinrich-Heine-University Düsseldorf, ³Max Planck Institute for Software Systems
Pseudocode	Yes	Algorithm 1: Constructing an SPE π = (π_t)_{t=0}^{∞}, given that π_t = π for all t ≥ T; Algorithm 2: Computing an ϵ-SPE
Open Source Code	No	The paper does not provide concrete access to source code, such as a repository link or an explicit statement about code release for the described methodology.
Open Datasets	No	This paper is theoretical and does not involve the use of datasets for training.
Dataset Splits	No	This paper is theoretical and does not discuss validation datasets or splits.
Hardware Specification	No	This paper is theoretical and does not describe any specific hardware used for running experiments.
Software Dependencies	No	This paper is theoretical and does not list any specific software dependencies with version numbers needed to replicate experimental results.
Experiment Setup	No	This paper is theoretical and does not describe an experimental setup with specific hyperparameters or system-level training settings.