Provably and Practically Efficient Neural Contextual Bandits
Authors: Sudeep Salgia
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we provide numerical experiments on comparing Neural GCB with several representative baselines, namely, Lin UCB (Chu et al., 2011), Neural UCB (Zhou et al., 2020), Neural TS (Zhang et al., 2021), Sup Neural UCB (Kassraie & Krause, 2022) and Batched Neural UCB (Gu et al., 2021). We perform the empirical studies on three synthetic and two real-world datasets. Figure 1: First, second and third rows correspond to the reward functions h1(x), h2(x) and the Mushroom dataset, respectively. The two leftmost columns show the cumulative regret incurred by the algorithms against number of steps, with σ1 activation functions for the first column and σ2 for the second. The rightmost column compares the regret incurred and the time taken (in seconds) for batched and sequential versions of Neural UCB and Neural GCB. |
| Researcher Affiliation | Collaboration | Department of Electrical and Computer Engineering, Cornell University, Ithaca, NY, USA. This is a joint work with Sattar Vakili at MediaTek Research, Cambridge, UK, and Qing Zhao at Cornell University. |
| Pseudocode | Yes | Algorithm 1 Neural GCB; Algorithm 2 Get Predictions; Algorithm 4 Train NN$(m, L, J, \eta, \lambda, W_0, \{(x_i, y_i)\}_{i=1}^{n})$ |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described in this paper. No explicit statements about code release or repository links are present. |
| Open Datasets | Yes | We also consider two real datasets for classification namely Mushroom and Statlog (Shuttle), both of which are available on the UCI repository (Dua & Graff, 2017). |
| Dataset Splits | No | The paper describes how subsets of the real datasets were selected (e.g., 'randomly select 1000 points from each class to create a smaller dataset of size 2000'), but it does not specify explicit training, validation, and test dataset splits (e.g., percentages or exact counts for each split). |
| Hardware Specification | Yes | All the experiments were carried out using Python3 on a computer (CPU) with 12 GB RAM and Intel i7 processor (3.4 GHz) with an overall compute time of around 250-300 hours. |
| Software Dependencies | No | The paper mentions that experiments were 'carried out using Python3' but does not specify any other software dependencies, libraries, or frameworks with version numbers that would be necessary for reproduction. |
| Experiment Setup | Yes | For all the experiments, the rewards are generated by adding zero mean Gaussian noise with a standard deviation of 0.1 to the reward function. All the experiments are run for a time horizon of T = 2000. We report the regret averaged over 10 Monte Carlo runs with different random seeds. For all the algorithms, we set the parameter ν to 0.1, and S, the RKHS norm of the reward, to 4 for synthetic functions, and 1 for real datasets. The exploration parameter $\beta_t$ is set to the value prescribed by each algorithm. ... We consider a 2 layered neural net for all the experiments as described in Equation (2). ... For all the experiments, we perform a grid search for λ and η over {0.05, 0.1, 0.5} and {0.001, 0.01, 0.1}, respectively, and choose the best ones for each algorithm. The number of epochs is set to 200 for synthetic datasets and Mushroom, and to 400 for Statlog. For the experiments with sequential algorithms, we retrain the neural nets at every step, including Neural GCB. For Batched Neural UCB, we use a fixed batch size of 10 for synthetic datasets, and 20 for Mushroom. For Neural GCB we set batch size to $q_r = 5 \cdot 2^{r-1}$ for synthetic datasets, and $q_r = 5 \cdot 2^{r+1}$ for Mushroom. |
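
The experiment setup row can be illustrated with a minimal sketch of the reported configuration: noisy rewards, the (λ, η) grid, and a doubling batch schedule. No code was released with the paper, so everything here is an assumption. The function names, the `toy_reward` placeholder, the reading of the batch-size expression as 5·2^(r-1) (synthetic) and 5·2^(r+1) (Mushroom), and the choice to truncate the final batch at the horizon are all hypothetical.

```python
import itertools
import math
import random

# Values quoted in the experiment setup row.
NOISE_STD = 0.1                   # std of the zero-mean Gaussian reward noise
HORIZON = 2000                    # time horizon T
N_RUNS = 10                       # Monte Carlo runs with different seeds
LAMBDA_GRID = (0.05, 0.1, 0.5)    # grid for lambda
ETA_GRID = (0.001, 0.01, 0.1)     # grid for eta


def noisy_reward(reward_fn, x, rng):
    """Return reward_fn(x) plus zero-mean Gaussian noise with std NOISE_STD."""
    return reward_fn(x) + rng.gauss(0.0, NOISE_STD)


def batch_sizes(horizon, base=5, shift=-1):
    """Batch sizes base * 2**(r + shift) for r = 1, 2, ...

    Assumption: the last batch is truncated so the sizes sum to `horizon`.
    shift=-1 is the assumed synthetic schedule, shift=+1 the Mushroom one.
    """
    sizes, used, r = [], 0, 1
    while used < horizon:
        q = min(base * 2 ** (r + shift), horizon - used)
        sizes.append(q)
        used += q
        r += 1
    return sizes


def hyperparameter_grid():
    """All (lambda, eta) pairs searched for each algorithm."""
    return list(itertools.product(LAMBDA_GRID, ETA_GRID))


def toy_reward(x):
    """Hypothetical placeholder; not one of the paper's reward functions."""
    return math.cos(sum(x))


if __name__ == "__main__":
    rng = random.Random(0)
    print(batch_sizes(HORIZON))
    print(len(hyperparameter_grid()), "grid points")
    print(noisy_reward(toy_reward, [0.0, 0.0], rng))
```

Under these assumptions the synthetic schedule covers T = 2000 in nine batches, whereas a sequential algorithm retrains at each of the 2000 steps.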