Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Improved Regret of Linear Ensemble Sampling
Authors: Harin Lee, Min-hwan Oh
NeurIPS 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | In this work, we close the fundamental gap of theory and practice by providing an improved regret bound for linear ensemble sampling. We prove that with an ensemble size logarithmic in T, linear ensemble sampling can achieve a frequentist regret bound of e O(d3/2 T), matching state-of-the-art results for randomized linear bandit algorithms, where d and T are the dimension of the parameter and the time horizon respectively. Our approach introduces a general regret analysis framework for linear bandit algorithms. Additionally, we reveal a significant relationship between linear ensemble sampling and Linear Perturbed-History Exploration (Lin PHE), showing that Lin PHE is a special case of linear ensemble sampling when the ensemble size equals T. This insight allows us to derive a new regret bound of e O(d3/2 T) for Lin PHE, independent of the number of arms. Our contributions advance the theoretical foundation of ensemble sampling, bringing its regret bounds in line with the best known bounds for other randomized exploration algorithms. The paper does not include experiments. |
| Researcher Affiliation | Academia | Harin Lee Seoul National University Seoul, South Korea EMAIL Min-hwan Oh Seoul National University Seoul, South Korea EMAIL |
| Pseudocode | Yes | Algorithm 1 Linear Ensemble Sampling and Algorithm 2 Linear Perturbed-History Exploration (Lin PHE) |
| Open Source Code | No | The paper does not include experiments, nor does it provide an explicit statement about releasing code or a link to a code repository. |
| Open Datasets | No | The paper is theoretical and does not involve experiments or datasets, as confirmed by the NeurIPS checklist: 'The paper does not include experiments.' |
| Dataset Splits | No | The paper is theoretical and does not involve experiments or dataset splits, as confirmed by the NeurIPS checklist: 'The paper does not include experiments.' |
| Hardware Specification | No | The paper is theoretical and does not involve experiments requiring hardware specifications, as confirmed by the NeurIPS checklist: 'The paper does not include experiments.' |
| Software Dependencies | No | The paper is theoretical and does not involve experiments requiring software dependencies, as confirmed by the NeurIPS checklist: 'The paper does not include experiments.' |
| Experiment Setup | No | The paper is theoretical and does not involve experiments or an experimental setup, as confirmed by the NeurIPS checklist: 'The paper does not include experiments.' |