Optimizing Local Satisfaction of Long-Run Average Objectives in Markov Decision Processes
Authors: David Klaška, Antonín Kučera, Vojtěch Kůr, Vít Musil, Vojtěch Řehák
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiment I: In our first experiment, we aim to analyze the impact of the β, γ coefficients in Comb and the size of available memory on the structure and performance of the resulting strategy σ. We use the graph D of Fig. 1(a) and the objective Distance_ν with the L2 norm, where ν(R) = 4/5 and ν(M) = 1/5. In our FR strategies, we allocate m ≥ 4 memory states to the vertex R and one memory state to the vertex M. The coefficients β, γ range over [0, 0.5] with a discrete step of 0.1. For every choice of β, γ, and m, we run Local Synt 40 times with Steps set to 800 and return the strategy σ with the least value of Comb found. Then, we use the Local Eval algorithm to compute L-Badness_σ(Distance_ν, d) for d ∈ {3, …, 10}. |
| Researcher Affiliation | Academia | Masaryk University, Brno, Czechia; david.klaska@mail.muni.cz, tony@fi.muni.cz, vojtech.kur@mail.muni.cz, musil@fi.muni.cz, rehak@fi.muni.cz |
| Pseudocode | Yes | Algorithm 1: The core procedure of Local Eval and Algorithm 2: Local Synt |
| Open Source Code | No | The paper states 'Our implementation uses PYTORCH framework (Paszke et al. 2019)' but provides no link to, or explicit statement about releasing, the source code for its own methodology or algorithms. |
| Open Datasets | No | The paper defines its experimental instances algorithmically (e.g., 'For every n ≥ 2, let Dn be a graph with vertices v1, …, vn') and uses a specific graph from a figure ('We use the graph D of Fig. 1(a)'). It does not use or provide access information for any publicly available or open datasets in the traditional sense. |
| Dataset Splits | No | The paper does not provide specific details on training, validation, and test dataset splits, nor does it mention using cross-validation or predefined splits from cited sources. |
| Hardware Specification | Yes | The system setup was as follows: CPU: AMD Ryzen 9 3900X (12 cores); RAM: 32 GB; OS: Ubuntu 20.04. |
| Software Dependencies | No | The paper mentions using the 'PYTORCH framework', the 'ADAM optimizer', and 'Ubuntu 20.04', but does not provide a specific version number for PyTorch or any other library, which are key software dependencies for reproducibility. |
| Experiment Setup | Yes | The coefficients β, γ range over [0, 0.5] with a discrete step of 0.1. For every choice of β, γ, and m, we run Local Synt 40 times with Steps set to 800 and return the strategy σ with the least value of Comb found. Then, we use the Local Eval algorithm to compute L-Badness_σ(Distance_ν, d) for d ∈ {3, …, 10}. |
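The experiment-setup protocol (a grid search over β, γ with 40 restarts of the synthesis procedure per configuration, keeping the strategy with the least Comb value) can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: `local_synt` is a hypothetical stand-in for the paper's gradient-based Local Synt routine (which uses PyTorch and Adam), and all function names, the stub objective, and the memory parameter `m=8` are assumptions for illustration.

```python
import itertools
import random

def local_synt(beta, gamma, m, steps=800, seed=None):
    """Hypothetical stand-in for the paper's Local Synt synthesis.

    Returns a (strategy, comb_value) pair. Here the strategy is a
    placeholder dict and Comb is a random stub value; the real routine
    performs `steps` gradient-descent iterations on the Comb objective.
    """
    rng = random.Random(seed)
    comb = rng.random() + beta + gamma  # placeholder objective value
    return {"beta": beta, "gamma": gamma, "m": m}, comb

def best_strategy(beta, gamma, m, restarts=40):
    """Run the synthesis `restarts` times, keep the least-Comb strategy,
    mirroring the paper's '40 runs, return the minimum' protocol."""
    return min(
        (local_synt(beta, gamma, m, seed=i) for i in range(restarts)),
        key=lambda strategy_comb: strategy_comb[1],
    )

# Grid over beta, gamma in [0, 0.5] with step 0.1 (6 values each, 36 pairs).
grid = [round(0.1 * k, 1) for k in range(6)]
results = {
    (b, g): best_strategy(b, g, m=8)
    for b, g in itertools.product(grid, grid)
}
```

In the paper, each selected strategy σ would then be passed to Local Eval to compute L-Badness_σ(Distance_ν, d) for d ∈ {3, …, 10}; that evaluation step is omitted from this sketch.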