Federated Neural Bandits
Authors: Zhongxiang Dai, Yao Shu, Arun Verma, Flint Xiaofeng Fan, Bryan Kian Hsiang Low, Patrick Jaillet
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We prove sub-linear upper bounds on both the cumulative regret and the number of communication rounds of FN-UCB, and empirically demonstrate its competitive performance. Finally, we use both synthetic and real-world contextual bandit experiments to explore the interesting insights about our FN-UCB and demonstrate its effective practical performance (Sec. 5). |
| Researcher Affiliation | Academia | Zhongxiang Dai, Yao Shu, Arun Verma, Flint Xiaofeng Fan & Bryan Kian Hsiang Low, Department of Computer Science, National University of Singapore {daizhongxiang,shuyao,arun,xiaofeng,lowkh}@comp.nus.edu.sg; Patrick Jaillet, Department of Electrical Engineering and Computer Science, MIT, jaillet@mit.edu |
| Pseudocode | Yes | Algorithm 1 FN-UCB (Agent i) and Algorithm 2 Central Server. (A hedged sketch of this agent/server pattern follows the table.) |
| Open Source Code | Yes | Our code has been submitted as supplementary material. |
| Open Datasets | Yes | We adopt the shuttle and magic telescope datasets from the UCI machine learning repository (Dua & Graff, 2017) and construct the experiments following a widely used protocol in previous works (Li et al., 2010a; Zhang et al., 2021; Zhou et al., 2020). The shuttle dataset is publicly available at https://archive.ics.uci.edu/ml/datasets/Statlog+(Shuttle) and contains no personally identifiable information or offensive content. The magic telescope dataset is publicly available at https://archive.ics.uci.edu/ml/datasets/magic+gamma+telescope and contains no personally identifiable information or offensive content. (A sketch of the classification-to-bandit protocol follows the table.) |
| Dataset Splits | No | The paper does not explicitly provide training/validation/test dataset splits with specific percentages or sample counts. It describes training models on collected observations and evaluating cumulative regret over time, but not in terms of distinct, pre-defined splits. |
| Hardware Specification | Yes | Our experiments are run on a server with 96 CPUs, an NVIDIA A100 GPU with a memory of 40GB, a RAM of 256GB, running the Ubuntu system. |
| Software Dependencies | No | The paper mentions 'running the Ubuntu system' but does not give the OS version or version numbers for any libraries or frameworks. |
| Experiment Setup | Yes | for all methods (including our FN-UCB, Neural UCB, and Neural TS), we use the same set of parameters of λ = ν_{TKN} = ν_{TK} = 0.1 and use an NN with 1 hidden layer and a width of m = 20. As suggested by our theoretical analysis (Sec. 3.3), we select an increasing sequence of α which is linearly increasing (to 1) in the first 700 iterations, and let α = 1 afterwards. Every time we train an NN, we use stochastic gradient descent to train the NN for 30 iterations with a learning rate of 0.01. (An illustrative sketch of this setup follows the table.) |
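For the communication pattern of Algorithms 1 and 2, the following is a minimal Python sketch. It is not the authors' implementation: the NTK features, the NN-based UCB, and the exact synchronization criterion are abstracted away, and all names (`Agent`, `synchronize`, `comm_threshold`, `nn_ucb`) are hypothetical.

```python
import numpy as np

class Agent:
    """One bandit agent keeping linear sufficient statistics (sketch only)."""

    def __init__(self, d, lam=0.1, comm_threshold=2.0):
        self.V_sync = lam * np.eye(d)   # statistics at the last synchronization
        self.b_sync = np.zeros(d)
        self.V_new = np.zeros((d, d))   # local statistics gathered since then
        self.b_new = np.zeros(d)
        self.comm_threshold = comm_threshold

    def nn_ucb(self, features):
        # Placeholder for the NN-based UCB (trained-network prediction plus an
        # exploration bonus); this sketch simply returns zeros.
        return np.zeros(features.shape[0])

    def select_arm(self, features, alpha):
        # features: (K, d) arm features (abstracted); alpha in [0, 1] trades
        # off the linear UCB against the NN-based UCB. Which term carries
        # alpha is an illustrative choice here, not taken from the paper.
        V = self.V_sync + self.V_new
        theta = np.linalg.solve(V, self.b_sync + self.b_new)
        V_inv = np.linalg.inv(V)
        width = np.sqrt(np.einsum("kd,de,ke->k", features, V_inv, features))
        ucb_lin = features @ theta + width
        ucb = alpha * ucb_lin + (1.0 - alpha) * self.nn_ucb(features)
        return int(np.argmax(ucb))

    def observe(self, x, reward):
        self.V_new += np.outer(x, x)
        self.b_new += reward * x

    def should_communicate(self):
        # Determinant-ratio trigger in the style of distributed linear
        # bandits; the paper's exact criterion may differ.
        grown = np.linalg.slogdet(self.V_sync + self.V_new)[1]
        base = np.linalg.slogdet(self.V_sync)[1]
        return (grown - base) > np.log(self.comm_threshold)


def synchronize(agents):
    """Central-server step (sketch): pool new statistics, broadcast, reset."""
    V_delta = sum(a.V_new for a in agents)
    b_delta = sum(a.b_new for a in agents)
    for a in agents:
        a.V_sync = a.V_sync + V_delta
        a.b_sync = a.b_sync + b_delta
        a.V_new = np.zeros_like(a.V_new)
        a.b_new = np.zeros_like(a.b_new)
```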
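The "widely used protocol" cited in the Open Datasets row converts a k-class classification dataset into a k-armed contextual bandit. The sketch below follows the standard construction from Li et al. (2010a) and Zhou et al. (2020); it is not the authors' preprocessing code, and `bandit_rounds` is a hypothetical helper.

```python
import numpy as np

def bandit_rounds(X, y, seed=0):
    """Stream (contexts, rewards) rounds from a k-class dataset (sketch).

    Each example with feature vector x becomes a round with k arms: arm a's
    context places x in the a-th block of a (k*d)-dimensional vector, and
    pulling arm a yields reward 1 iff a equals the true class label.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    k = int(y.max()) + 1                 # assumes labels remapped to 0..k-1
    for i in rng.permutation(n):
        contexts = np.zeros((k, k * d))
        for a in range(k):
            contexts[a, a * d:(a + 1) * d] = X[i]
        rewards = (np.arange(k) == y[i]).astype(float)
        yield contexts, rewards
```

For the shuttle dataset (7 classes) this yields 7 arms per round, and the binary reward of the pulled arm is all the learner observes.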
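The Experiment Setup row pins down a small training routine: a 1-hidden-layer network of width m = 20, 30 SGD steps at learning rate 0.01, and a weight α that rises linearly to 1 over the first 700 iterations. The PyTorch sketch below reproduces only those reported numbers; the squared-error loss, ReLU activation, and function names are assumptions.

```python
import torch
import torch.nn as nn

def alpha_schedule(t, warmup=700):
    # alpha increases linearly to 1 over the first 700 iterations, then stays 1.
    return min(t / warmup, 1.0)

def make_network(d, m=20):
    # 1 hidden layer of width m = 20, as reported in the setup.
    return nn.Sequential(nn.Linear(d, m), nn.ReLU(), nn.Linear(m, 1))

def train_network(net, contexts, rewards, lr=0.01, steps=30):
    # contexts: (n, d) float tensor; rewards: (n,) float tensor.
    # 30 SGD steps at learning rate 0.01, matching the quoted setup;
    # the mean-squared-error loss is an assumption.
    opt = torch.optim.SGD(net.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = ((net(contexts).squeeze(-1) - rewards) ** 2).mean()
        loss.backward()
        opt.step()
    return net
```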