Federated Neural Bandits

Authors: Zhongxiang Dai, Yao Shu, Arun Verma, Flint Xiaofeng Fan, Bryan Kian Hsiang Low, Patrick Jaillet

ICLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We prove sub-linear upper bounds on both the cumulative regret and the number of communication rounds of FN-UCB, and empirically demonstrate its competitive performance. Finally, we use both synthetic and real-world contextual bandit experiments to explore the interesting insights about our FN-UCB and demonstrate its effective practical performance (Sec. 5)."
Researcher Affiliation | Academia | "Zhongxiang Dai, Yao Shu, Arun Verma, Flint Xiaofeng Fan & Bryan Kian Hsiang Low, Department of Computer Science, National University of Singapore, {daizhongxiang,shuyao,arun,xiaofeng,lowkh}@comp.nus.edu.sg; Patrick Jaillet, Department of Electrical Engineering and Computer Science, MIT, jaillet@mit.edu"
Pseudocode | Yes | "Algorithm 1 FN-UCB (Agent i)" and "Algorithm 2 Central Server" (an illustrative sketch of this agent/server split follows the table).
Open Source Code | Yes | "Our code has been submitted as supplementary material."
Open Datasets | Yes | "We adopt the shuttle and magic telescope datasets from the UCI machine learning repository (Dua & Graff, 2017) and construct the experiments following a widely used protocol in previous works (Li et al., 2010a; Zhang et al., 2021; Zhou et al., 2020). The shuttle dataset is publicly available at https://archive.ics.uci.edu/ml/datasets/Statlog+(Shuttle) and contains no personally identifiable information or offensive content. The magic telescope dataset is publicly available at https://archive.ics.uci.edu/ml/datasets/magic+gamma+telescope and contains no personally identifiable information or offensive content." (A sketch of this classification-to-bandit protocol follows the table.)
Dataset Splits | No | The paper does not explicitly provide training/validation/test dataset splits with specific percentages or sample counts. It describes training models on collected observations and evaluating cumulative regret over time, but not in terms of distinct, pre-defined splits.
Hardware Specification | Yes | "Our experiments are run on a server with 96 CPUs, an NVIDIA A100 GPU with a memory of 40GB, a RAM of 256GB, running the Ubuntu system."
Software Dependencies | No | The paper mentions "running the Ubuntu system" but does not give its version, nor version numbers for other software dependencies such as libraries or frameworks.
Experiment Setup | Yes | "for all methods (including our FN-UCB, Neural UCB, and Neural TS), we use the same set of parameters of λ = ν_{TKN} = ν_{TK} = 0.1 and use an NN with 1 hidden layer and a width of m = 20. As suggested by our theoretical analysis (Sec. 3.3), we select an increasing sequence of α which is linearly increasing (to 1) in the first 700 iterations, and let α = 1 afterwards. Every time we train an NN, we use stochastic gradient descent to train the NN for 30 iterations with a learning rate of 0.01." (A configuration sketch follows the table.)
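
The Pseudocode row points to a two-sided design: a per-agent algorithm (Algorithm 1) and a central server (Algorithm 2). The sketch below only illustrates that split under simple assumptions; the class and function names, and the choice to exchange raw (V, b) sufficient statistics, are ours, not the paper's exact Algorithms 1-2.

```python
# Illustrative agent/server split for a federated bandit; names and the
# exchanged quantities are assumptions, not the paper's exact algorithms.
import numpy as np

class Agent:
    """One bandit agent holding local sufficient statistics."""

    def __init__(self, dim: int, lam: float = 0.1):
        self.V = lam * np.eye(dim)          # regularized design matrix
        self.b = np.zeros(dim)              # reward-weighted feature sum
        self.V_new = np.zeros((dim, dim))   # statistics since last sync
        self.b_new = np.zeros(dim)

    def observe(self, phi: np.ndarray, reward: float) -> None:
        # phi: feature of the pulled arm (NN-gradient features in the paper)
        self.V_new += np.outer(phi, phi)
        self.b_new += reward * phi

    def upload(self):
        # ship only the statistics accumulated since the last synchronization
        msg = (self.V_new.copy(), self.b_new.copy())
        self.V_new[:] = 0.0
        self.b_new[:] = 0.0
        return msg

    def download(self, V_agg: np.ndarray, b_agg: np.ndarray) -> None:
        self.V += V_agg
        self.b += b_agg

def synchronize(agents) -> None:
    """Central server: aggregate all agents' new statistics, broadcast back."""
    msgs = [a.upload() for a in agents]
    V_agg = sum(m[0] for m in msgs)
    b_agg = sum(m[1] for m in msgs)
    for a in agents:
        a.download(V_agg, b_agg)

# usage: two agents observe once, then one communication round
rng = np.random.default_rng(0)
agents = [Agent(dim=4) for _ in range(2)]
for a in agents:
    a.observe(rng.standard_normal(4), rng.random())
synchronize(agents)
```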
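
The Open Datasets row cites a "widely used protocol" (Li et al., 2010a; Zhou et al., 2020) for turning a K-class classification dataset into a K-armed contextual bandit: arm k embeds the feature vector in the k-th block of a dK-dimensional context, and the reward is 1 iff the pulled arm equals the true class. The sketch below follows that standard recipe; the synthetic stand-in data merely mimics shuttle's shape (9 attributes, 7 classes), and the paper's actual preprocessing may differ.

```python
# Standard classification-to-bandit construction; the random stand-in data
# and the uniform placeholder policy are assumptions for illustration only.
import numpy as np

def make_bandit_arms(x: np.ndarray, num_arms: int) -> np.ndarray:
    """Return a (num_arms, num_arms * d) matrix of disjoint arm contexts."""
    d = x.shape[0]
    arms = np.zeros((num_arms, num_arms * d))
    for k in range(num_arms):
        arms[k, k * d:(k + 1) * d] = x   # arm k sees x in block k
    return arms

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 9))     # stand-in for shuttle's 9 attributes
y = rng.integers(0, 7, size=5)      # stand-in for shuttle's 7 classes

for x, label in zip(X, y):
    arms = make_bandit_arms(x, num_arms=7)
    pulled = rng.integers(0, 7)             # placeholder policy
    reward = float(pulled == label)         # binary reward: correct class?
```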
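
The Experiment Setup row pins down a small, concrete configuration. Below is a hedged PyTorch reading of it: a one-hidden-layer network of width m = 20, SGD with learning rate 0.01 for 30 iterations per training call, and a weight α that grows linearly to 1 over the first 700 iterations. The squared-error loss and the dummy data are assumptions, not details from the paper.

```python
# Configuration sketch matching the quoted hyperparameters; loss and data
# are illustrative assumptions.
import torch
import torch.nn as nn

def alpha(t: int, warmup: int = 700) -> float:
    # linearly increasing to 1 over the first 700 iterations, then fixed at 1
    return min(t / warmup, 1.0)

def make_network(input_dim: int, width: int = 20) -> nn.Module:
    # one hidden layer of width m = 20
    return nn.Sequential(nn.Linear(input_dim, width), nn.ReLU(),
                         nn.Linear(width, 1))

def train_network(net: nn.Module, contexts: torch.Tensor,
                  rewards: torch.Tensor, steps: int = 30,
                  lr: float = 0.01) -> nn.Module:
    # SGD for 30 iterations at learning rate 0.01, as quoted
    opt = torch.optim.SGD(net.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = ((net(contexts).squeeze(-1) - rewards) ** 2).mean()
        loss.backward()
        opt.step()
    return net

# usage with dummy data: 63 = 9 features x 7 arms under the disjoint encoding
net = make_network(input_dim=63)
train_network(net, torch.randn(16, 63), torch.rand(16))
print(alpha(350), alpha(1000))   # 0.5 1.0
```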