Algorithms for Differentially Private Multi-Armed Bandits
Authors: Aristide Tossou, Christos Dimitrakakis
AAAI 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We perform experiments using arms with rewards drawn from independent Bernoulli distributions. The plot, in logarithmic scale, shows the regret of the algorithms over 100,000 time steps averaged over 100 runs. We targeted 2 different ϵ privacy levels: 0.1 and 1. |
| Researcher Affiliation | Academia | Aristide C. Y. Tossou and Christos Dimitrakakis Chalmers University of Technology, Gothenburg, Sweden {aristide, chrdimi}@chalmers.se |
| Pseudocode | Yes | Algorithm 1 DP-UCB-BOUND. Input: ϵ, the differential privacy parameter. Instantiate K Hybrid Mechanisms (Chan, Shi, and Song 2010), one for each arm a. For t = 1 to T: if t ≤ K, play arm a = t, observe the reward r_t, and insert r_t into hybrid mechanism a; otherwise, for all arms a, compute s_a(t) (the total sum computed using hybrid mechanism a), n′_{a,t} ← n_{a,t} − 2^⌊log₂ n_{a,t}⌋ (remove the closest power of 2 from n_{a,t}), and ν_a ← (4√8 ⁄ ϵ) log t (log₂ n′_{a,t} + 1); then pull arm a_t = argmax_a [s_a(t)/n_{a,t} + ν_a/n_{a,t}], observe the reward r_t, and insert r_t into the hybrid mechanism for arm a_t. |
| Open Source Code | No | The paper does not provide any statement or link indicating that the source code for the methodology is openly available. |
| Open Datasets | No | We perform experiments using arms with rewards drawn from independent Bernoulli distributions. The paper describes generating data for the experiments rather than using a pre-existing public dataset with explicit access information. |
| Dataset Splits | No | The paper describes running experiments for a certain number of time steps over multiple runs, but it does not specify train/validation/test splits, which are typically used for traditional supervised learning datasets. |
| Hardware Specification | No | The paper does not provide any details about the specific hardware (e.g., GPU, CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper does not list any specific software dependencies with version numbers that would be required to reproduce the experiments. |
| Experiment Setup | Yes | We targeted 2 different ϵ privacy levels: 0.1 and 1. For DP-UCB-INT, we pick ϵ according to Corollary 3.2 such that the overall privacy is (ϵ, δ)-DP with δ = e^−10, ϵ ∈ {0.1, 1}, v = 1.1. We compared against the non-private UCB algorithm and the algorithm presented in (Mishra and Thakurta 2015) (Private-UCB), with a failure probability chosen to be t^−4. We perform two scenarios. First, we used two arms: one with expectation 0.9 and the other 0.6. The second scenario is more challenging, with 10 arms all having expectation 0.1 except two with expectations 0.55 and 0.2. |
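The pseudocode and experiment setup extracted above can be sketched end to end in Python. This is a minimal illustration, not the authors' implementation: `TreeMechanism` below is a simplified binary-tree mechanism standing in for the full hybrid mechanism of Chan, Shi, and Song (2010), the confidence term `nu` only approximates Algorithm 1's bound (it uses n_{a,t} rather than n′_{a,t}), and the function names and Bernoulli-arm simulation are assumptions made for the sketch.

```python
import math
import random


class TreeMechanism:
    """Simplified binary-tree mechanism for differentially private
    running sums (Chan, Shi, and Song 2010). Each dyadic tree node
    holds a true partial sum plus Laplace noise drawn once; a prefix
    sum is answered from O(log n) nodes. The paper's hybrid mechanism
    refines this construction -- this class is an illustrative stand-in."""

    def __init__(self, eps, T):
        self.levels = max(1, math.ceil(math.log2(T)) + 1)
        self.eps_node = eps / self.levels  # split the budget across levels
        self.n = 0
        self.node_sums = {}   # (level, index) -> true partial sum
        self.node_noise = {}  # (level, index) -> Laplace noise, drawn once

    def _laplace(self, scale):
        # Inverse-CDF sampling of Laplace(0, scale)
        u = random.random() - 0.5
        return -scale * math.copysign(math.log(1 - 2 * abs(u)), u)

    def insert(self, r):
        self.n += 1
        for level in range(self.levels):
            key = (level, (self.n - 1) >> level)
            self.node_sums[key] = self.node_sums.get(key, 0.0) + r
            if key not in self.node_noise:
                self.node_noise[key] = self._laplace(1.0 / self.eps_node)

    def private_sum(self):
        # Decompose [1, n] into dyadic nodes, one per set bit of n
        total, i, level = 0.0, self.n, 0
        while i > 0:
            if i & 1:
                key = (level, i - 1)
                total += self.node_sums.get(key, 0.0) + self.node_noise.get(key, 0.0)
            i >>= 1
            level += 1
        return total


def dp_ucb_bound(arm_means, eps, T):
    """Run a DP-UCB-BOUND-style bandit on Bernoulli arms; return
    (cumulative pseudo-regret, per-arm pull counts)."""
    K = len(arm_means)
    mechs = [TreeMechanism(eps, T) for _ in range(K)]
    counts = [0] * K
    regret, best = 0.0, max(arm_means)
    for t in range(1, T + 1):
        if t <= K:
            a = t - 1  # play each arm once
        else:
            def index(a):
                n = counts[a]
                s = mechs[a].private_sum()
                # Empirical mean plus a bound on the mechanism's noise
                nu = (4 * math.sqrt(8) / eps) * math.log(t) * (math.log2(n) + 1)
                return s / n + nu / n
            a = max(range(K), key=index)
        r = 1.0 if random.random() < arm_means[a] else 0.0
        mechs[a].insert(r)
        counts[a] += 1
        regret += best - arm_means[a]
    return regret, counts
```

A run mirroring the paper's first scenario (two arms with expectations 0.9 and 0.6) would be `dp_ucb_bound([0.9, 0.6], eps=1.0, T=2000)`; the cumulative pseudo-regret is at most the gap (0.3) times the horizon by construction.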