Performance Guarantees for Homomorphisms beyond Markov Decision Processes

Authors: Sultan Javed Majeed, Marcus Hutter (pp. 7659-7666)

AAAI 2019

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | In this work, we use similar notation and techniques of Hutter (2016) but investigate and prove optimality bounds for non-MDP state-action homomorphisms in GRL.
Researcher Affiliation | Academia | Research School of Computer Science, Australian National University, Australia
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide concrete access to source code for the methodology described.
Open Datasets | No | The paper uses conceptual examples like 'Navigational Grid-world' for illustration but does not provide access information for any publicly available or open dataset used in empirical studies.
Dataset Splits | No | The paper is theoretical and does not provide specific dataset split information (percentages, sample counts, or citations to predefined splits) needed to reproduce data partitioning for experiments.
Hardware Specification | No | The paper is theoretical and does not provide specific hardware details used for running experiments.
Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers needed to replicate the experiment.
Experiment Setup | No | The paper does not contain specific experimental setup details, such as concrete hyperparameter values or training configurations, for a reproducible experiment. It mentions 'Value Iteration (VI) (Bellman 1957) with some fixed but irrelevant parameters on the grid world' but this is within a motivational example, not a detailed experimental setup.