Geometric Active Exploration in Markov Decision Processes: the Benefit of Abstraction
Authors: Riccardo De Santi, Federico Arangath Joseph, Noah Liniger, Mirco Mutti, Andreas Krause
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we perform a thorough experimental evaluation of GAE analysing its statistical and computational efficiency on two tasks where the unknown quantity f represents: (1) the amount of pollutant emerging from a point source (see Fig. 1), and (2) the toxicity of chemical compounds generated from a set of base elements (see Fig. 2f). |
| Researcher Affiliation | Academia | ¹Department of Computer Science, ETH Zurich, Zurich, Switzerland; ²ETH AI Center, Zurich, Switzerland; ³Technion, Haifa, Israel. |
| Pseudocode | Yes | Algorithm 1 Geometric Active Exploration (GAE) |
| Open Source Code | No | The paper does not contain an explicit statement about the release of its source code or a link to a code repository. |
| Open Datasets | No | The paper describes the construction of simulated environments and data within the paper (e.g., 'S = 240 states' for pollutant diffusion and 'S = 363 states' for chemical compounds) but does not provide access information (link, DOI, specific citation) for a publicly available dataset. |
| Dataset Splits | No | The paper describes simulated environments and data generation processes, but does not specify training, validation, or test dataset splits (e.g., percentages or sample counts). |
| Hardware Specification | No | The paper mentions computational time was measured but does not specify the exact hardware (CPU, GPU models, or specific machine configurations) used for the experiments. |
| Software Dependencies | No | The paper mentions 'Python' and a 'standard time library in Python' but does not provide specific version numbers for these or any other key software dependencies (e.g., machine learning frameworks, solvers, or additional libraries). |
| Experiment Setup | Yes | The smoothness parameter was chosen to be η = 0.001, and δ = 0.01 for both deterministic and stochastic dynamics. Furthermore, we found that in practice, a constant number of interactions τ_k = τ for all the K iterations of GAE works well, especially for remarkably low τ. In this setting, we chose τ = 3... To update the abstract state-action frequency λ_{k+1}, we also use a constant update step of 0.005/S. The initial state of the agent was chosen on the outermost circle. [...] n as 210, resulting in K = 70 iterations of GAE. All the experiments were repeated over 15 random seeds. |
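
The values quoted in the Experiment Setup row can be collected into a small configuration object. The sketch below is illustrative only: the paper releases no code, so the class name `GAEConfig` and all field names are hypothetical. The relation K = n / τ is an assumption; it is consistent with the reported numbers (210 / 3 = 70) but the quoted text does not state it. The value of S passed to `step_size` is also an assumption, taken here from the 240-state pollutant task.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GAEConfig:
    """Hyperparameters quoted in the Experiment Setup row (all names hypothetical)."""

    eta: float = 0.001         # smoothness parameter η
    delta: float = 0.01        # δ, same for deterministic and stochastic dynamics
    tau: int = 3               # constant number of interactions per iteration (τ_k = τ)
    n_samples: int = 210       # reported value of n
    n_seeds: int = 15          # number of random seeds per experiment
    step_scale: float = 0.005  # numerator of the constant update step 0.005 / S

    @property
    def num_iterations(self) -> int:
        # Assumption: K = n / τ, which reproduces the reported K = 70.
        return self.n_samples // self.tau

    def step_size(self, num_states: int) -> float:
        # Constant update step 0.005 / S for the abstract state-action frequency.
        return self.step_scale / num_states


config = GAEConfig()
print(config.num_iterations)   # 70
print(config.step_size(240))   # step for S = 240 (assumed: pollutant task)
```

Keeping the iteration count as a derived property rather than a stored field means a change to `tau` or `n_samples` cannot leave K out of sync with the budget.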