Incentivized Truthful Communication for Federated Bandits
Authors: Zhepei Wei, Chuanhao Li, Tianze Ren, Haifeng Xu, Hongning Wang
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive numerical studies further validate the effectiveness of our proposed solution. |
| Researcher Affiliation | Academia | University of Virginia; University of Chicago |
| Pseudocode | Yes | Algorithm 1 Truthful Incentive Search |
| Open Source Code | No | The paper does not provide any statements about releasing code or links to a code repository for the methodology described. |
| Open Datasets | No | The paper states 'we create a simulated federated bandit learning environment' but does not provide access information (link, DOI, or formal citation) for any publicly available or open dataset. |
| Dataset Splits | No | The paper uses a 'simulated federated bandit learning environment' and mentions a 'fixed time horizon T', but it does not specify traditional train/validation/test dataset splits (e.g., percentages or sample counts). |
| Hardware Specification | No | The paper mentions running experiments in a 'simulated federated bandit learning environment' but does not specify any hardware details (e.g., GPU models, CPU types, or memory) used for these simulations. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., Python 3.8, PyTorch 1.9) required for reproducibility. |
| Experiment Setup | Yes | For demonstration purposes, we instantiate it as a combination of the client's weighted data collection cost plus its intrinsic preference cost, i.e., f(V_{i,t}) = w · det(V_{i,t}) + C_i, where w = 10^{-4}, and each client i's intrinsic preference cost C_i is uniformly sampled from U(0, 100). In the simulated environment (Section 5), the time horizon is T = 6250, the total number of clients is N = 25, and the context dimension is d = 5. We set the hyper-parameters ϵ = 1.0 and β = 0.5 in Algorithm 1 and Algorithm 3. The tolerance factor in Algorithm 7 is γ = 1.0. |
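
The experiment-setup cell fully specifies the simulated cost function, so it can be sketched directly. The snippet below is a minimal, hypothetical reconstruction in NumPy, assuming the standard linear-bandit covariance statistic V_{i,t} = I + Σ_s x_s x_sᵀ; the names `communication_cost`, `C`, and `V` are illustrative and not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Constants reported in the paper's setup.
T = 6250   # time horizon
N = 25     # number of clients
d = 5      # context dimension
w = 1e-4   # weight on the data-collection cost (10^{-4} per the setup)

# Each client i's intrinsic preference cost C_i ~ U(0, 100).
C = rng.uniform(0.0, 100.0, size=N)

def communication_cost(V_it: np.ndarray, client: int) -> float:
    """Cost of communicating for a client, per the reported form:
    f(V_{i,t}) = w * det(V_{i,t}) + C_i."""
    return w * np.linalg.det(V_it) + C[client]

# Illustrative covariance after 100 observations (an assumption:
# V_{i,t} = I + sum_s x_s x_s^T with unit-norm contexts x_s).
V = np.eye(d)
for _ in range(100):
    x = rng.normal(size=d)
    x /= np.linalg.norm(x)
    V += np.outer(x, x)

print(communication_cost(V, client=0))
```

With w = 10^{-4}, the det(V_{i,t}) term is negligible in early rounds (det ≈ 1 before observations accumulate), so the intrinsic cost C_i dominates at first under this reconstruction; the determinant term only becomes comparable once a client has gathered many observations.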