Kernelized Reinforcement Learning with Order Optimal Regret Bounds
Authors: Sattar Vakili, Julia Olkhovskaya
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We prove the first order-optimal regret guarantees under a general setting. Our results show a significant improvement over the state of the art, polynomial in the number of episodes. |
| Researcher Affiliation | Collaboration | Sattar Vakili, MediaTek Research, Cambridge, UK (sattar.vakili@mtkresearch.com); Julia Olkhovskaya, TU Delft, Delft, the Netherlands (julia.olkhovskaya@gmail.com) |
| Pseudocode | Yes | Pseudocode is provided in Algorithm 1. |
| Open Source Code | No | The paper does not provide any statement about releasing its source code or a link to a code repository. |
| Open Datasets | No | The paper is theoretical and does not describe conducting experiments with a specific dataset. Therefore, it does not provide access information for a dataset. |
| Dataset Splits | No | The paper is theoretical and does not conduct experiments involving dataset splits for training, validation, or testing. |
| Hardware Specification | No | The paper is theoretical and does not describe any hardware used for experiments. |
| Software Dependencies | No | The paper is theoretical and does not mention any specific software dependencies with version numbers required to replicate experimental results. |
| Experiment Setup | No | The paper is theoretical and does not describe an experimental setup, hyperparameters, or system-level training settings. |