Online Markov Decision Processes Configuration with Continuous Decision Space
Authors: Davide Maran, Pierriccardo Olivieri, Francesco Emanuele Stradi, Giuseppe Urso, Nicola Gatti, Marcello Restelli
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we compare the empiric performance of our algorithms with a baseline in synthetic experiments. |
| Researcher Affiliation | Academia | Politecnico di Milano, {davide.maran, pierriccardo.olivieri, francescoemanuele.stradi, nicola.gatti, marcello.restelli}@polimi.it, giuseppe.urso@mail.polimi.it |
| Pseudocode | Yes | Algorithm 1: Agent-Configurator Interaction ... Algorithm 2: O-DOSC Algorithm ... Algorithm 3: O-SOSC Algorithm |
| Open Source Code | No | The paper does not provide any explicit statements about releasing source code or links to a code repository for the described methodology. |
| Open Datasets | No | The paper mentions 'synthetic experiments' but does not provide concrete access information (link, DOI, specific citation with author/year) for a publicly available dataset, nor does it specify exact dataset splits for training. |
| Dataset Splits | No | The paper describes 'synthetic experiments' but does not provide specific dataset split information (exact percentages, sample counts, citations to predefined splits, or detailed splitting methodology) for training, validation, or testing. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types, memory amounts, or detailed computer specifications) used for running its experiments. It only mentions 'synthetic experiments'. |
| Software Dependencies | No | The paper does not provide specific ancillary software details, such as library or solver names with version numbers, needed to replicate the experiment. |
| Experiment Setup | No | The paper states 'For reasons of space, the description of the experimental settings and additional details on the experimental results can be found in the Appendix.' and describes the MDP structure used in experiments (four layers, two states, two actions). However, it does not contain specific hyperparameters, training configurations, or system-level settings within the main text. |