Learning Kernelized Contextual Bandits in a Distributed and Asynchronous Environment
Authors: Chuanhao Li, Huazheng Wang, Mengdi Wang, Hongning Wang
ICLR 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To validate Async-KernelUCB's effectiveness in reducing communication cost, we performed extensive empirical evaluations on both synthetic and real-world datasets, and reported the results (over 10 runs) in Figure 2. |
| Researcher Affiliation | Academia | Chuanhao Li (University of Virginia), Huazheng Wang (Oregon State University), Mengdi Wang (Princeton University), Hongning Wang (University of Virginia) |
| Pseudocode | Yes | Algorithm 1: Asynchronous KernelUCB (Async-KernelUCB) |
| Open Source Code | No | The paper does not contain any explicit statements or links indicating that source code for the described methodology is publicly available. |
| Open Datasets | Yes | UCI Machine Learning Repository (Dua & Graff, 2017) (...) MovieLens consists of 25 million ratings between 160 thousand users and 60 thousand movies (Harper & Konstan, 2015). |
| Dataset Splits | No | The paper does not specify exact percentages or sample counts for training, validation, or test dataset splits. It describes how datasets were partitioned or features extracted, but not the specific data partitioning for model training and evaluation. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU/CPU models, memory) used to run the experiments. |
| Software Dependencies | No | The paper mentions using a 'Gaussian kernel' and 'Sigmoid function' but does not specify any software libraries or their version numbers (e.g., Python, PyTorch, TensorFlow, scikit-learn) used for implementation or experimentation. |
| Experiment Setup | Yes | Synthetic dataset: We simulated the distributed bandit setting in Section 3.1, with d = 20, T = 10^4, N = 10^2. At each time step t ∈ [T], client i_t ∈ [N] selects an arm from candidate set A_t (with \|A_t\| = 20) (...) For all the kernel bandit algorithms, we used the Gaussian kernel k(x, y) = exp(−γ‖x − y‖²), where we did a grid search of γ ∈ {0.1, 1, 4}, and for Fed-GLB-UCB, we used the Sigmoid function µ(z) = (1 + exp(−z))^{−1} as link function. For all algorithms, instead of using their theoretically derived exploration coefficient α, we followed the convention of Li et al. (2010a) and Zhou et al. (2020) to use grid search for α in {0.1, 1, 4}. |
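
The experiment-setup row above names a Gaussian kernel, a Sigmoid link function, and grid searches over γ and α. The sketch below illustrates those components around a standard single-learner KernelUCB score; it is not the paper's Async-KernelUCB (the asynchronous client-server communication and buffer updates are omitted), and all function names and the random data are illustrative assumptions, not the authors' code.

```python
# Minimal sketch of the quoted setup: Gaussian kernel, Sigmoid link, and a
# grid search over (gamma, alpha) with a plain KernelUCB-style arm score.
# Assumed/illustrative names throughout; not the authors' implementation.
import numpy as np


def gaussian_kernel(X, Y, gamma):
    """Gaussian kernel k(x, y) = exp(-gamma * ||x - y||^2), computed pairwise."""
    sq_dists = (
        np.sum(X**2, axis=1)[:, None]
        + np.sum(Y**2, axis=1)[None, :]
        - 2 * X @ Y.T
    )
    return np.exp(-gamma * sq_dists)


def sigmoid_link(z):
    """Sigmoid link function mu(z) = (1 + exp(-z))^{-1}, as used for Fed-GLB-UCB."""
    return 1.0 / (1.0 + np.exp(-z))


def kernel_ucb_scores(X_hist, y_hist, X_cand, gamma, alpha, reg=1.0):
    """UCB score per candidate arm: kernel-ridge posterior mean + alpha * posterior std."""
    K = gaussian_kernel(X_hist, X_hist, gamma) + reg * np.eye(len(X_hist))
    K_inv = np.linalg.inv(K)
    k_cand = gaussian_kernel(X_hist, X_cand, gamma)  # shape (n_hist, n_cand)
    mean = k_cand.T @ K_inv @ y_hist
    var = gaussian_kernel(X_cand, X_cand, gamma).diagonal() - np.sum(
        k_cand * (K_inv @ k_cand), axis=0
    )
    return mean + alpha * np.sqrt(np.maximum(var, 0.0))


# Grid search over gamma and alpha in {0.1, 1, 4}, as described in the quoted setup.
rng = np.random.default_rng(0)
d, n_hist, n_cand = 20, 50, 20  # d = 20 and |A_t| = 20 match the quoted values
X_hist = rng.normal(size=(n_hist, d))
y_hist = rng.normal(size=n_hist)
X_cand = rng.normal(size=(n_cand, d))
for gamma in (0.1, 1, 4):
    for alpha in (0.1, 1, 4):
        scores = kernel_ucb_scores(X_hist, y_hist, X_cand, gamma, alpha)
        print(f"gamma={gamma}, alpha={alpha}: selected arm {int(np.argmax(scores))}")
```

In the paper's evaluation, each (γ, α) pair would be judged by cumulative regret and communication cost over the full run; the loop here only prints the selected arm on random data so the sketch stays self-contained.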