An Optimal Private Stochastic-MAB Algorithm based on Optimal Private Stopping Rule

Authors: Touqir Sajed, Or Sheffet

ICML 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We also compare empirically the performance of our algorithm with the private UCB algorithm." (Abstract); "our experiments (see Section 5) show a significantly improved performance empirically" (Introduction); "In this section, we empirically compare the existing DP-UCB algorithm to our DP-SE algorithm (Algorithm 3)." (Section 5)
Researcher Affiliation | Academia | "Department of Computing Science, University of Alberta, Edmonton AB, Canada. Correspondence to: Touqir Sajed <touqir@ualberta.ca>, Or Sheffet <osheffet@ualberta.ca>."
Pseudocode | Yes | Algorithm 1: DP-NAS; Algorithm 2: DP exponential NAS; Algorithm 3: DP Successive Elimination
Open Source Code | No | The paper contains no explicit statement about releasing source code for the described methodology and no link to a code repository.
Open Datasets | Yes | "In our experiments we set the default values of T = 5 × 10^7, ε = 0.25 and K = 5. We assume T is a-priori known to both algorithms and set β = 1/T. Due to space constraints, we only present our empirical results on one instance where the reward means of the arms are linearly spaced in the range [0.25, 0.75]. Namely in our instance, for K = 5, the mean rewards are in {0.75, 0.625, 0.5, 0.375, 0.25}."
Dataset Splits | No | The paper describes the setup for simulating the multi-armed bandit problem with defined reward distributions, but it does not specify train/validation/test splits or cross-validation methods of the kind used to partition data in supervised learning tasks.
Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU models, CPU types, memory) used to run the experiments.
Software Dependencies | No | The paper does not specify any software dependencies with version numbers (e.g., programming languages, libraries, frameworks) that would be needed to replicate the experiments.
Experiment Setup | Yes | "In our experiments we set the default values of T = 5 × 10^7, ε = 0.25 and K = 5. We assume T is a-priori known to both algorithms and set β = 1/T. Due to space constraints, we only present our empirical results on one instance where the reward means of the arms are linearly spaced in the range [0.25, 0.75]. Namely in our instance, for K = 5, the mean rewards are in {0.75, 0.625, 0.5, 0.375, 0.25}."
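
To make the quoted experiment setup concrete, below is a minimal Python sketch of the simulated bandit instance it describes. The values of T, ε, K, β and the arm means are taken from the quote above; the Bernoulli reward model, the random seed, and every identifier (e.g., `pull`) are illustrative assumptions rather than details from the paper.

```python
import numpy as np

# Parameters quoted in the table above; everything else is an assumption.
T = 5 * 10**7          # horizon, a-priori known to both algorithms
epsilon = 0.25         # differential-privacy parameter
K = 5                  # number of arms
beta = 1.0 / T         # confidence / failure parameter

# Reward means linearly spaced in [0.25, 0.75]: {0.75, 0.625, 0.5, 0.375, 0.25}.
arm_means = np.linspace(0.75, 0.25, K)

rng = np.random.default_rng(seed=0)

def pull(arm: int) -> float:
    """Draw one reward for `arm` (Bernoulli rewards are an assumption)."""
    return float(rng.random() < arm_means[arm])

if __name__ == "__main__":
    # Sanity check: empirical means from a few thousand pulls per arm.
    n = 5000
    for a in range(K):
        estimate = np.mean([pull(a) for _ in range(n)])
        print(f"arm {a}: true mean {arm_means[a]:.3f}, estimate {estimate:.3f}")
```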
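
The Pseudocode row above lists Algorithm 3, DP Successive Elimination, as the paper's main bandit algorithm. As a rough illustration of the general technique (and emphatically not a reproduction of the paper's Algorithm 3), the sketch below runs a generic epoch-based successive elimination loop in which each surviving arm's empirical mean is perturbed with Laplace noise; the noise scale, confidence radius, and elimination rule are simplified assumptions chosen only to show the idea.

```python
import numpy as np

def successive_elimination_with_laplace(arm_means, epsilon, beta,
                                        epochs=10, base_pulls=2000, rng=None):
    """Generic noisy successive elimination, NOT the paper's Algorithm 3.

    Each epoch pulls every surviving arm n times, adds Laplace noise
    calibrated to the sensitivity of a [0,1]-bounded mean (1/n), and
    drops arms whose noisy mean falls well below the current best.
    The constants in the confidence radius are illustrative.
    """
    rng = rng if rng is not None else np.random.default_rng(0)
    K = len(arm_means)
    active = list(range(K))

    for r in range(epochs):
        n = base_pulls * (2 ** r)  # longer epochs shrink the confidence radius
        noisy = {}
        for a in active:
            rewards = (rng.random(n) < arm_means[a]).astype(float)  # assumed Bernoulli
            noisy[a] = rewards.mean() + rng.laplace(scale=1.0 / (epsilon * n))

        # Hoeffding term for sampling error plus a Laplace tail term (assumed form).
        radius = (np.sqrt(np.log(4.0 * K / beta) / (2.0 * n))
                  + np.log(4.0 * K / beta) / (epsilon * n))

        best = max(noisy.values())
        active = [a for a in active if noisy[a] >= best - 2.0 * radius]
        if len(active) == 1:
            break

    return active

if __name__ == "__main__":
    # Instance from the quoted setup: 5 arms with means {0.75, ..., 0.25}.
    means = np.linspace(0.75, 0.25, 5)
    print(successive_elimination_with_laplace(means, epsilon=0.25, beta=2e-8))
```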