Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Kernelized Reinforcement Learning with Order Optimal Regret Bounds
Authors: Sattar Vakili, Julia Olkhovskaya
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We prove the first order-optimal regret guarantees under a general setting. Our results show a significant polynomial in the number of episodes improvement over the state of the art. |
| Researcher Affiliation | Collaboration | Sattar Vakili, MediaTek Research, Cambridge, UK; Julia Olkhovskaya, TU Delft, Delft, the Netherlands |
| Pseudocode | Yes | A pseudocode is provided in Algorithm 1. |
| Open Source Code | No | The paper does not provide any statement about releasing its source code or a link to a code repository. |
| Open Datasets | No | The paper is theoretical and does not describe conducting experiments with a specific dataset. Therefore, it does not provide access information for a dataset. |
| Dataset Splits | No | The paper is theoretical and does not conduct experiments involving dataset splits for training, validation, or testing. |
| Hardware Specification | No | The paper is theoretical and does not describe any hardware used for experiments. |
| Software Dependencies | No | The paper is theoretical and does not mention any specific software dependencies with version numbers required to replicate experimental results. |
| Experiment Setup | No | The paper is theoretical and does not describe an experimental setup, hyperparameters, or system-level training settings. |