Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Optimal Rates for Bandit Nonstochastic Control

Authors: Y. Jennifer Sun, Stephen Newman, Elad Hazan

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | We answer in the affirmative, giving an algorithm for bandit LQR and LQG which attains optimal regret (up to logarithmic factors) for both known and unknown systems.
Researcher Affiliation | Collaboration | Y. Jennifer Sun (Princeton University, Google DeepMind); Stephen Newman (Princeton University); Elad Hazan (Princeton University & Google DeepMind)
Pseudocode | Yes | Algorithm 1: Ellipsoidal BCO with memory (EBCO-M); Algorithm 2: Ellipsoidal Bandit Perturbation Controller (EBPC); Algorithm 3: System estimation via least squares (SysEst-LS)
Open Source Code | No | The paper does not include any explicit statement about releasing open-source code or a link to a code repository for the methodology described.
Open Datasets | No | The paper is theoretical and does not conduct empirical studies using datasets, hence no information on dataset availability or access is provided.
Dataset Splits | No | The paper is theoretical and does not involve empirical experiments with datasets, thus no information on training/test/validation splits is provided.
Hardware Specification | No | The paper is theoretical and does not describe an experimental setup with hardware specifications.
Software Dependencies | No | The paper is theoretical and does not describe an experimental setup with specific software dependencies or version numbers.
Experiment Setup | No | The paper is theoretical and does not provide details about an experimental setup, such as hyperparameters or training settings.