reproducibilityindex.ai

Bilinear Bandits with Low-rank Structure

Authors: Kwang-Sung Jun, Rebecca Willett, Stephen Wright, Robert Nowak

ICML 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We conjecture that the regret bound of ESTR is unimprovable up to polylogarithmic factors, and our preliminary experiment shows that ESTR outperforms a na ıve linear bandit reduction.
Researcher Affiliation	Academia	1Boston University 2University of Chicago 3University of Wisconsin-Madison.
Pseudocode	Yes	Algorithm 1 Low OFUL
Open Source Code	No	The paper mentions using a C implementation wrapped in Python for Opt Space (https://github.com/strin/pyOptSpace), which is a third-party tool, but does not provide a link or statement for their own implementation of ESTR or Low OFUL.
Open Datasets	No	The paper describes a simulation setup for its experiments: 'We draw 16 arms from the unit sphere for each arm set X and Z and simulate the bandit game for T = 104 iterations'. This is a synthetic data generation process, not a publicly available dataset.
Dataset Splits	No	The paper describes a simulation with synthetic data, not a pre-existing dataset with formal train/validation/test splits. It mentions 'tuning T1 by grid search' which implies some form of validation, but no explicit dataset splits are provided.
Hardware Specification	No	The paper does not specify any particular hardware used for running the experiments (e.g., CPU, GPU models, or memory specifications).
Software Dependencies	No	The paper mentions using 'the authors C implementation that is wrapped in python' for Opt Space and refers to the 'Burer-Monteiro formulation', but does not provide specific version numbers for any software dependencies or libraries.
Experiment Setup	Yes	We run our simulation with d1 = d2 = 8, r = 1, σ = 0.01. We set λ = 1 for both OFUL and Low OFUL. For the sake of comparison, we tune c by grid search and report the result with the smallest average regret. For ESTR, the value of T1 used in the proof involves some unknown constants; to account for this, we tune T1 by grid search.