Kernel-Based Reinforcement Learning: A Finite-Time Analysis

Authors: Omar Darwiche Domingues, Pierre Menard, Matteo Pirotta, Emilie Kaufmann, Michal Valko

ICML 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We empirically validate our approach in continuous MDPs with sparse rewards.
Researcher Affiliation | Collaboration | Inria Lille, Université de Lille, Otto von Guericke University, Facebook AI Research (Paris), CNRS, DeepMind (Paris).
Pseudocode | Yes | Algorithm 1 (Kernel-UCBVI) and Algorithm 2 (computation of the optimistic Q-function).
Open Source Code | Yes | Implementations of Kernel-UCBVI are available on GitHub and use the rlberry library (Domingues et al., 2021). The reference provides the link: https://github.com/rlberry-py/rlberry
Open Datasets | No | The paper describes a custom GridWorld environment (Section 7) but does not provide concrete access information (link, DOI, specific repository, or formal citation for a public dataset) for it.
Dataset Splits | No | The paper does not specify explicit training, validation, or test dataset splits or percentages; it operates in an episodic reinforcement learning setting.
Hardware Specification | No | The paper does not specify hardware details such as GPU/CPU models, memory, or the computing infrastructure used for the experiments.
Software Dependencies | No | The paper mentions the rlberry library but does not provide version numbers for it or for any other software dependencies.
Experiment Setup | Yes | We used the Euclidean distance and the Gaussian kernel with a fixed bandwidth σ = 0.025, matching the granularity of the uniform discretization used by some of the baselines. We ran the algorithms for 5 × 10^4 episodes (an illustrative sketch of this kernel weighting is given after the table).
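
The Experiment Setup row fixes the kernel used in the experiments (Euclidean distance, Gaussian kernel, bandwidth σ = 0.025). The NumPy sketch below illustrates how such Gaussian kernel weights, a generalized visit count, and a UCB-style bonus can be computed; the bandwidth comes from the row above, while the function names, the regularizer `beta`, the bonus form, and the toy data are illustrative assumptions, not the paper's exact implementation (that lives in the rlberry-based code linked above).

```python
import numpy as np

def gaussian_kernel_weights(query, past_points, sigma=0.025):
    """Kernel weights of a query state-action pair w.r.t. previously visited
    pairs, using the Euclidean distance and a Gaussian kernel with bandwidth
    sigma, as in the Experiment Setup row."""
    dists = np.linalg.norm(past_points - query, axis=1)   # Euclidean distances
    return np.exp(-dists ** 2 / (2.0 * sigma ** 2))       # Gaussian kernel

def kernel_estimates(query, past_points, past_rewards, sigma=0.025, beta=0.05):
    """Illustrative kernel-weighted reward estimate, generalized count, and
    schematic exploration bonus; `beta` is a small regularizer chosen here
    only so the count never vanishes."""
    w = gaussian_kernel_weights(query, past_points, sigma)
    count = beta + w.sum()                    # generalized visit count
    r_hat = np.dot(w, past_rewards) / count   # kernel-weighted reward estimate
    bonus = 1.0 / np.sqrt(count)              # schematic UCB-style bonus
    return r_hat, bonus

# Toy example: a 2D query point compared against three past visits.
past = np.array([[0.10, 0.10], [0.12, 0.10], [0.90, 0.90]])
rewards = np.array([0.0, 0.0, 1.0])
print(kernel_estimates(np.array([0.11, 0.10]), past, rewards))
```

With σ = 0.025, only past visits within a few hundredths of a unit of the query contribute noticeable weight, which is why the paper describes the bandwidth as matching the granularity of the baselines' uniform discretization.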