Nonparametric Contextual Bandits in Metric Spaces with Unknown Metric

Authors: Nirandika Wanigasekara, Christina Yu

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We provide regret bounds along with simulations that highlight the algorithm's dependence on the local geometry of the reward functions. We provide simulations that compare our algorithm to oracle variants that have special knowledge of the arms and a naive benchmark that learns over each arm separately.
Researcher Affiliation | Academia | Nirandika Wanigasekara, Computer Science, National University of Singapore, nirandiw@comp.nus.edu.sg; Christina Lee Yu, Operations Research and Information Engineering, Cornell University, cleeyu@cornell.edu
Pseudocode | Yes | See the appendix for a pseudocode description of the algorithm.
Open Source Code | No | The paper states 'See the appendix for a pseudocode description of the algorithm.' but provides neither concrete access to the source code for the described methodology (e.g., a link to a repository) nor an explicit statement about its open-source availability.
Open Datasets | No | The paper uses a synthetic model to generate data for simulations, describing how contexts and arms are sampled ('context x_t ~ U(0, 1)', 'Each arm a corresponds to a parameter θ_a uniformly spaced out within [0, 1]'). It does not refer to or provide access information for a pre-existing publicly available dataset.
Dataset Splits | No | The paper describes an online learning setting with trials over a time horizon and a simulation setup, but it does not specify explicit training, validation, and test dataset splits with percentages or sample counts.
Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments (e.g., GPU models, CPU types, or memory specifications).
Software Dependencies | No | The paper does not name the software used (e.g., libraries or solvers) or give version numbers.
Experiment Setup | Yes | For all algorithms the flagging rule is set to nt(ρ) 4 ln(T)/ 2, and σ was set to 1e-2. For Approx-Zooming, k was set to 10. We set the number of trials T to 100,000 as all the algorithms had converged to their optimal point by then. Additional details on how the model parameters were chosen is given in Appendix F.
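
The sampling scheme quoted in the Open Datasets row (contexts drawn uniformly from [0, 1], arm parameters uniformly spaced within [0, 1]) can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' code: the number of arms (`N_ARMS`), the inclusion of both endpoints in the spacing, the seed, and the function names are all assumptions introduced here. The reward function is not described in this summary, so it is left out.

```python
import random

# Number of trials, as quoted in the Experiment Setup row.
T = 100_000

# Hypothetical value: the summary does not state how many arms were used.
N_ARMS = 20


def make_arm_parameters(n_arms):
    """Return n_arms parameters evenly spaced on [0, 1].

    Including both endpoints 0 and 1 is an assumption; the summary only
    says the parameters are "uniformly spaced out within [0, 1]".
    """
    if n_arms < 2:
        raise ValueError("need at least two arms to space them over [0, 1]")
    return [i / (n_arms - 1) for i in range(n_arms)]


def sample_contexts(n_trials, seed=0):
    """Draw one context per trial, independently and uniformly from [0, 1)."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatable runs
    return [rng.uniform(0.0, 1.0) for _ in range(n_trials)]


theta = make_arm_parameters(N_ARMS)
contexts = sample_contexts(T)
```

A bandit simulation would then loop over `contexts`, choose an arm index into `theta` at each step, and observe a noisy reward from whatever reward model the paper's Appendix F specifies.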