Optimizing Local Satisfaction of Long-Run Average Objectives in Markov Decision Processes

Authors: David Klaška, Antonín Kučera, Vojtěch Kůr, Vít Musil, Vojtěch Řehák

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiment I: In our first experiment, we aim to analyze the impact of the β, γ coefficients in Comb and the size of available memory on the structure and performance of the resulting strategy σ. We use the graph D of Fig. 1(a) and the objective Distanceν with L2 norm where ν(R) = 4/5 and ν(M) = 1/5. In our FR strategies, we allocate m 4 memory states to the vertex R and one memory state to the vertex M. The coefficients β, γ range over [0, 0.5] with a discrete step 0.1. For every choice of β, γ, and m, we run Local Synt 40 times with Steps set to 800 and return the strategy σ with the least value of Comb found. Then, we use the Local Eval algorithm to compute L-Badnessσ(Distanceν, d) for d ∈ {3, ..., 10}.
Researcher Affiliation | Academia | Masaryk University, Brno, Czechia. david.klaska@mail.muni.cz, tony@fi.muni.cz, vojtech.kur@mail.muni.cz, musil@fi.muni.cz, rehak@fi.muni.cz
Pseudocode | Yes | Algorithm 1: The core procedure of Local Eval; Algorithm 2: Local Synt
Open Source Code | No | The paper states 'Our implementation uses PYTORCH framework (Paszke et al. 2019)' but gives no link to, or explicit statement about releasing, the source code for the methods and algorithms it describes.
Open Datasets | No | The paper defines its experimental instances algorithmically (e.g., 'For every n 2, let Dn be a graph with vertices v1, ..., vn') and uses a specific graph from a figure ('We use the graph D of Fig. 1(a)'). It neither uses nor provides access information for any publicly available dataset.
Dataset Splits | No | The paper gives no details on training, validation, or test splits, and does not mention cross-validation or predefined splits from cited sources.
Hardware Specification | Yes | The system setup was as follows: CPU: AMD Ryzen 9 3900X (12 cores); RAM: 32GB; Ubuntu 20.04.
Software Dependencies | No | The paper mentions the 'PYTORCH framework', the 'ADAM optimizer', and 'Ubuntu 20.04', but gives no version number for PyTorch, which is a key software dependency for reproducibility.
Experiment Setup | Yes | The coefficients β, γ range over [0, 0.5] with a discrete step 0.1. For every choice of β, γ, and m, we run Local Synt 40 times with Steps set to 800 and return the strategy σ with the least value of Comb found. Then, we use the Local Eval algorithm to compute L-Badnessσ(Distanceν, d) for d ∈ {3, ..., 10}.
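
The experiment setup quoted above amounts to a grid sweep with best-of-40 selection. Because the authors' code is not available, the following Python sketch only illustrates the shape of that protocol. All function names (`coefficient_grid`, `best_of_runs`, `l2_distance`, `placeholder_synthesize`) are hypothetical, and `placeholder_synthesize` is a random stand-in that does not implement Local Synt; only the numeric settings (range [0, 0.5], step 0.1, 40 runs, 800 steps, d from 3 to 10, ν(R) = 4/5, ν(M) = 1/5) come from the text.

```python
import itertools
import math
import random

# Target frequencies from the quoted setup: nu(R) = 4/5, nu(M) = 1/5.
NU = {"R": 4 / 5, "M": 1 / 5}

# Values of d for which L-Badness is reported: {3, ..., 10}.
D_VALUES = range(3, 11)

_RNG = random.Random(0)  # fixed seed so the placeholder is repeatable


def l2_distance(freq, target=NU):
    """L2 distance between a frequency vector and the target vector."""
    return math.sqrt(sum((freq.get(v, 0.0) - p) ** 2 for v, p in target.items()))


def coefficient_grid(lo=0.0, hi=0.5, step=0.1):
    """All (beta, gamma) pairs in [lo, hi] taken with a discrete step."""
    n = round((hi - lo) / step)
    values = [round(lo + i * step, 10) for i in range(n + 1)]
    return list(itertools.product(values, repeat=2))


def best_of_runs(synthesize, beta, gamma, runs=40, steps=800):
    """Call `synthesize` `runs` times and keep the result with the least Comb.

    `synthesize(beta, gamma, steps)` must return a (strategy, comb_value) pair.
    """
    results = [synthesize(beta, gamma, steps) for _ in range(runs)]
    return min(results, key=lambda result: result[1])


def placeholder_synthesize(beta, gamma, steps):
    """Hypothetical stand-in, NOT the paper's Local Synt.

    Returns a random frequency vector as the "strategy" and its L2 distance
    to NU as a mock Comb value; beta, gamma, and steps are ignored.
    """
    p = _RNG.random()
    freq = {"R": p, "M": 1.0 - p}
    return freq, l2_distance(freq)


if __name__ == "__main__":
    for beta, gamma in coefficient_grid():
        strategy, comb = best_of_runs(placeholder_synthesize, beta, gamma)
        print(f"beta={beta:.1f} gamma={gamma:.1f} best mock Comb={comb:.4f}")
```

Swapping `placeholder_synthesize` for a real synthesis routine with the same return shape would keep the sweep logic unchanged. The memory size m is left out of the grid here because its range is not recoverable from the text.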