Lenient Regret for Multi-Armed Bandits

Authors: Nadav Merlis, Shie Mannor
Pages: 8950-8957

AAAI 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we present an empirical evaluation of ϵ-TS. Specifically, we compare ϵ-TS to the vanilla TS on two different gap functions: f(Δ) = Δ, which leads to the standard regret, and the hinge function f(Δ) = max{Δ - ϵ, 0}. All evaluations were performed for ϵ = 0.2 over 50,000 different seeds and are depicted in Figure 3.
Researcher Affiliation | Collaboration | (1) Technion Institute of Technology, Israel; (2) Nvidia Research, Israel
Pseudocode | Yes | Algorithm 1: ϵ-TS for Bernoulli arms
Open Source Code | No | The paper does not contain any explicit statements about making the source code available, nor does it provide links to a code repository.
Open Datasets | No | The paper mentions using “Bernoulli rewards” and running simulations “over 50,000 different seeds,” indicating a synthetic data generation process based on a known distribution rather than a specific, named public dataset with access information or a citation.
Dataset Splits | No | The paper describes simulation-based evaluations (“over 50,000 different seeds” with “Bernoulli rewards”) and analyzes asymptotic and finite-time behavior, but it does not specify explicit train/validation/test dataset splits or their percentages.
Hardware Specification | No | The paper does not provide any specific details regarding the hardware (e.g., GPU models, CPU types, memory specifications) used to run the experiments.
Software Dependencies | No | The paper does not specify any software dependencies or their version numbers (e.g., programming languages, libraries, frameworks) used in the experiments.
Experiment Setup | Yes | “All evaluations were performed for ϵ = 0.2 over 50,000 different seeds and are depicted in Figure 3.” and “We tested 4 different scenarios when the optimal arm is smaller or larger than 1 - ϵ (left and right columns, respectively), and when the minimal gap is larger or smaller than ϵ (top and bottom rows, respectively).”
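
The two gap functions named in the Research Type row, and the 2x2 scenario grid from the Experiment Setup row, can be illustrated with a minimal Python sketch. It is not the paper's code (none was released) and it does not implement ϵ-TS, since Algorithm 1 is not reproduced on this page. Instead it runs vanilla Thompson Sampling with Beta(1, 1) priors on Bernoulli arms and accumulates both regret notions. The function names, the arm means, and the horizon are hypothetical choices made only for illustration.

```python
import random


def standard_gap(delta):
    """f(delta) = delta, which leads to the standard regret."""
    return delta


def hinge_gap(delta, eps):
    """f(delta) = max{delta - eps, 0}: gaps of at most eps cost nothing."""
    return max(delta - eps, 0.0)


def scenario(means, eps):
    """Place arm means on the 2x2 grid described in the Experiment Setup row.

    Returns (optimal_above, gap_above): whether the optimal mean exceeds
    1 - eps, and whether the minimal positive gap exceeds eps.
    Assumes at least one suboptimal arm exists.
    """
    best = max(means)
    min_gap = min(best - m for m in means if m < best)
    return best > 1.0 - eps, min_gap > eps


def run_vanilla_ts(means, horizon, eps, seed):
    """Vanilla Thompson Sampling on Bernoulli arms (assumed Beta(1, 1) priors).

    Returns (standard regret, hinge lenient regret) summed over the horizon.
    """
    rng = random.Random(seed)
    best = max(means)
    successes = [0] * len(means)
    failures = [0] * len(means)
    standard = 0.0
    lenient = 0.0
    for _ in range(horizon):
        # Sample a mean estimate for each arm from its Beta posterior.
        samples = [
            rng.betavariate(s + 1, f + 1)
            for s, f in zip(successes, failures)
        ]
        arm = max(range(len(means)), key=samples.__getitem__)
        # Draw a Bernoulli reward and update the posterior counts.
        if rng.random() < means[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
        delta = best - means[arm]
        standard += standard_gap(delta)
        lenient += hinge_gap(delta, eps)
    return standard, lenient
```

For example, `run_vanilla_ts([0.9, 0.5], horizon=1000, eps=0.2, seed=0)` returns the two regrets for a single seed. Averaging over many seeds would mirror the evaluation protocol described above, although the paper's exact arm means and horizon are not stated on this page.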