Provably and Practically Efficient Neural Contextual Bandits

Authors: Sudeep Salgia

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In this section, we provide numerical experiments comparing NeuralGCB with several representative baselines, namely LinUCB (Chu et al., 2011), NeuralUCB (Zhou et al., 2020), NeuralTS (Zhang et al., 2021), SupNeuralUCB (Kassraie & Krause, 2022), and Batched NeuralUCB (Gu et al., 2021). We perform the empirical studies on three synthetic and two real-world datasets." The response also quotes the caption of Figure 1: "First, second and third rows correspond to the reward functions h1(x), h2(x) and the Mushroom dataset, respectively. The two leftmost columns show the cumulative regret incurred by the algorithms against number of steps, with σ1 activation functions for the first column and σ2 for the second. The rightmost column compares the regret incurred and the time taken (in seconds) for batched and sequential versions of NeuralUCB and NeuralGCB."
Researcher Affiliation | Collaboration | "Department of Electrical and Computer Engineering, Cornell University, Ithaca, NY, USA. This is a joint work with Sattar Vakili at MediaTek Research, Cambridge, UK, and Qing Zhao at Cornell University."
Pseudocode | Yes | "Algorithm 1 NeuralGCB", "Algorithm 2 Get Predictions", "Algorithm 4 TrainNN(m, L, J, η, λ, W0, {(x_i, y_i)}_{i=1}^n)"
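The signature of the TrainNN subroutine (width m, depth L, J gradient steps, learning rate η, regularization λ, random initialization W0, and n labeled pairs) matches the NeuralUCB-style convention of running J steps of gradient descent on a squared loss regularized toward the initialization. As a hedged illustration only, with a linear model f(x; W) = W·x standing in for the width-m, depth-L network (the paper's Algorithm 4 is the authoritative procedure, and `train_nn` here is an illustrative name, not the paper's code):

```python
import numpy as np

def train_nn(J, eta, lam, W0, data):
    """Sketch of a TrainNN-style routine: J gradient-descent steps with
    learning rate eta on a squared loss, L2-regularized (weight lam)
    toward the initialization W0.  data is a list of (x, y) pairs.
    A linear predictor f(x; W) = W @ x stands in for the neural net."""
    X = np.stack([x for x, _ in data])
    y = np.array([t for _, t in data])
    W = W0.copy()
    for _ in range(J):
        resid = X @ W - y
        # Gradient of mean squared error plus regularization toward W0,
        # mirroring the role of lam * ||W - W0||^2 in NeuralUCB-style training.
        grad = X.T @ resid / len(data) + lam * (W - W0)
        W -= eta * grad
    return W
```

Replacing the linear predictor with an actual two-layer network (as in Equation (2) of the paper) changes only the forward pass and the gradient computation; the loop structure stays the same.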
Open Source Code | No | The paper does not provide concrete access to source code for the methodology described in this paper. No explicit statements about code release or repository links are present.
Open Datasets | Yes | "We also consider two real datasets for classification, namely Mushroom and Statlog (Shuttle), both of which are available on the UCI repository (Dua & Graff, 2017)."
Dataset Splits | No | The paper describes how subsets of the real datasets were selected (e.g., "randomly select 1000 points from each class to create a smaller dataset of size 2000"), but it does not specify explicit training, validation, and test splits (e.g., percentages or exact counts for each split).
Hardware Specification | Yes | "All the experiments were carried out using Python3 on a computer (CPU) with 12 GB RAM and Intel i7 processor (3.4 GHz) with an overall compute time of around 250-300 hours."
Software Dependencies | No | The paper mentions that experiments were "carried out using Python3" but does not specify any other software dependencies, libraries, or frameworks with version numbers that would be necessary for reproduction.
Experiment Setup | Yes | "For all the experiments, the rewards are generated by adding zero-mean Gaussian noise with a standard deviation of 0.1 to the reward function. All the experiments are run for a time horizon of T = 2000. We report the regret averaged over 10 Monte Carlo runs with different random seeds. For all the algorithms, we set the parameter ν to 0.1, and S, the RKHS norm of the reward, to 4 for synthetic functions, and 1 for real datasets. The exploration parameter βt is set to the value prescribed by each algorithm. ... We consider a 2-layered neural net for all the experiments as described in Equation (2). ... For all the experiments, we perform a grid search for λ and η over {0.05, 0.1, 0.5} and {0.001, 0.01, 0.1}, respectively, and choose the best ones for each algorithm. The number of epochs is set to 200 for synthetic datasets and Mushroom, and to 400 for Statlog. For the experiments with sequential algorithms, we retrain the neural nets at every step, including NeuralGCB. For Batched NeuralUCB, we use a fixed batch size of 10 for synthetic datasets, and 20 for Mushroom. For NeuralGCB we set the batch size to q_r = 5·2^(r−1) for synthetic datasets, and q_r = 5·2^(r+1) for Mushroom."
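The reported setup combines a noisy-reward generator (zero-mean Gaussian noise, σ = 0.1) with a geometrically growing batch schedule q_r = 5·2^(r−1) for NeuralGCB on synthetic datasets. A minimal sketch of both, assuming the schedule's last batch is truncated so the batches exactly cover the horizon T = 2000 (`batch_schedule` and `noisy_reward` are illustrative helper names, not from the paper):

```python
import numpy as np

def batch_schedule(T, base=5, offset=-1):
    """Geometric batch sizes q_r = base * 2**(r + offset) for r = 1, 2, ...
    The final batch is truncated so the schedule sums exactly to T."""
    sizes, total, r = [], 0, 1
    while total < T:
        q = min(base * 2 ** (r + offset), T - total)
        sizes.append(q)
        total += q
        r += 1
    return sizes

def noisy_reward(h, x, sigma=0.1, rng=None):
    """Reward = h(x) plus zero-mean Gaussian noise with std sigma,
    matching the paper's reported reward-generation procedure."""
    rng = rng if rng is not None else np.random.default_rng()
    return h(x) + rng.normal(0.0, sigma)

# Synthetic-dataset setting: q_r = 5 * 2**(r-1), horizon T = 2000.
sched = batch_schedule(2000)            # batches 5, 10, 20, ..., summing to 2000
mushroom_sched = batch_schedule(2000, offset=+1)  # q_r = 5 * 2**(r+1)
```

The doubling schedule means the number of retraining rounds grows only logarithmically in T, which is what makes the batched variant much cheaper than retraining at every step as the sequential algorithms do.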