Reinforcement Learning in Configurable Continuous Environments

Authors: Alberto Maria Metelli, Emanuele Ghelfi, Marcello Restelli

Venue: ICML 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we provide the experimental evaluation of REMPS on three domains: a simple chain domain (Section 6.1, Figure 1), the classical Cartpole (Section 6.2) and a more challenging car-configuration task based on TORCS (Section 6.3).
Researcher Affiliation | Academia | Politecnico di Milano, Piazza Leonardo da Vinci 32, Milan, Italy.
Pseudocode | Yes | Algorithm 1 Relative Entropy Model Policy Search (a hedged sketch of both algorithm steps follows this table)
Open Source Code | No | The paper does not provide explicit statements about the release of its own source code or links to a code repository for the described methodology.
Open Datasets | No | The paper mentions benchmark environments like Cartpole and TORCS, which are generally known, but does not provide specific access information (links, DOIs, repository names, or formal citations for specific dataset instances) for the training data or collected datasets used in their experiments.
Dataset Splits | No | The paper does not provide specific details on dataset splits (e.g., percentages or counts for training, validation, and test sets) or cross-validation setup.
Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU/CPU models, memory, or cloud instances) used for running the experiments.
Software Dependencies | No | The paper mentions software like TORCS but does not provide specific version numbers for any ancillary software, libraries, or dependencies used in the experiments.
Experiment Setup | Yes | Hyperparameter values and further experiments, including the effect of the different projection strategies, no-configuration cases, and the comparison with SPMI (Metelli et al., 2018), are reported in Appendix D.1. (the projection step is sketched below)
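The Algorithm 1 referenced in the Pseudocode row, Relative Entropy Model Policy Search (REMPS), alternates two steps: an optimization step that reweights collected samples toward higher return under a relative-entropy (KL) constraint of size kappa, and a projection step that maps the reweighted distribution back onto the parametric policy and environment-configuration spaces. Below is a minimal REPS-style sketch of the optimization step, not the authors' exact formulation: the episode-return signal, the function names, and the default kappa are illustrative assumptions, and the one-dimensional dual for the temperature eta is solved numerically with SciPy.

```python
import numpy as np
from scipy.optimize import minimize

def remps_dual(eta, returns, kappa):
    """REPS-style dual objective g(eta) = eta*kappa + eta*log E[exp(R/eta)].

    Shifting by max(R) keeps the exponentials numerically stable; the
    shift is added back so the dual value is unchanged.
    """
    eta = float(np.atleast_1d(eta)[0])
    shifted = returns - returns.max()
    return eta * kappa + eta * np.log(np.mean(np.exp(shifted / eta))) + returns.max()

def optimization_step(returns, kappa=1e-3):
    """Solve the dual for eta and return normalized sample weights.

    The weights w_i proportional to exp(R_i / eta) define the reweighted
    (improved) distribution that the projection step will fit.
    """
    res = minimize(remps_dual, x0=np.array([1.0]),
                   args=(returns, kappa), bounds=[(1e-6, None)])
    eta = res.x[0]
    w = np.exp((returns - returns.max()) / eta)
    return w / w.sum(), eta
```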
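Continuing the sketch, the projection step can be approximated as a weighted maximum-likelihood fit, which minimizes the KL divergence from the reweighted sample distribution to the parametric policy pi_theta; the paper applies the analogous fit to the configurable transition model p_omega and compares projection strategies in Appendix D.1. The log_pi argument and its signature here are hypothetical placeholders for a caller-supplied log-density.

```python
import numpy as np
from scipy.optimize import minimize

def projection_step(weights, states, actions, log_pi, theta0):
    """Weighted maximum-likelihood fit of pi_theta to the reweighted samples.

    log_pi(theta, states, actions) -> array of per-sample log-densities
    (hypothetical signature). The same weighted fit would be applied to the
    environment-configuration parameters omega.
    """
    def neg_weighted_ll(theta):
        return -np.sum(weights * log_pi(theta, states, actions))
    return minimize(neg_weighted_ll, np.asarray(theta0, dtype=float)).x

# Usage sketch: weights, _ = optimization_step(returns)
#               theta_new = projection_step(weights, S, A, log_pi, theta_old)
```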