Lenient Regret for Multi-Armed Bandits
Authors: Nadav Merlis, Shie Mannor (pp. 8950-8957)
AAAI 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we present an empirical evaluation of ϵ-TS. Specifically, we compare ϵ-TS to the vanilla TS on two different gap functions: f(Δ) = Δ, which leads to the standard regret, and the hinge function f(Δ) = max{Δ − ϵ, 0}. All evaluations were performed for ϵ = 0.2 over 50,000 different seeds and are depicted in Figure 3. |
| Researcher Affiliation | Collaboration | (1) Technion Institute of Technology, Israel; (2) Nvidia Research, Israel |
| Pseudocode | Yes | Algorithm 1 ϵ-TS for Bernoulli arms |
| Open Source Code | No | The paper does not contain any explicit statements about making the source code available, nor does it provide links to a code repository. |
| Open Datasets | No | The paper mentions using “Bernoulli rewards” and running simulations “over 50,000 different seeds,” indicating a synthetic data generation process based on a known distribution rather than a specific, named public dataset with access information or a citation. |
| Dataset Splits | No | The paper describes simulation-based evaluations (“over 50,000 different seeds” with “Bernoulli rewards”) and analyzes asymptotic and finite-time behavior, but it does not specify explicit train/validation/test dataset splits or their percentages. |
| Hardware Specification | No | The paper does not provide any specific details regarding the hardware (e.g., GPU models, CPU types, memory specifications) used to run the experiments. |
| Software Dependencies | No | The paper does not specify any software dependencies or their version numbers (e.g., programming languages, libraries, frameworks) used in the experiments. |
| Experiment Setup | Yes | All evaluations were performed for ϵ = 0.2 over 50,000 different seeds and are depicted in Figure 3. We tested 4 different scenarios: when the optimal arm is smaller or larger than 1 − ϵ (left and right columns, respectively), and when the minimal gap is larger or smaller than ϵ (top and bottom rows, respectively). |
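
The two gap functions named in the table can be illustrated with a minimal sketch. It runs vanilla Thompson sampling with Beta(1, 1) priors on Bernoulli arms and scores the same sequence of pulls under both the standard gap and the hinge gap with ϵ = 0.2. This is not the paper's ϵ-TS (Algorithm 1 is not reproduced here), and the arm means, horizon, seed, and function names are all hypothetical choices made for illustration. The regret is assumed to be the running sum of f(Δ) over the pulled arms, where Δ is the gap to the best arm.

```python
import numpy as np


def standard_gap(delta):
    """f(delta) = delta, which gives the standard regret."""
    return delta


def hinge_gap(delta, eps=0.2):
    """f(delta) = max(delta - eps, 0): gaps below eps cost nothing."""
    return np.maximum(delta - eps, 0.0)


def run_vanilla_ts(means, horizon, rng):
    """Vanilla Thompson sampling for Bernoulli arms with Beta(1, 1) priors.

    Returns the index of the arm pulled at each round.
    """
    means = np.asarray(means, dtype=float)
    successes = np.zeros(len(means))
    failures = np.zeros(len(means))
    pulls = np.empty(horizon, dtype=int)
    for t in range(horizon):
        # Sample one plausible mean per arm from its Beta posterior.
        samples = rng.beta(successes + 1.0, failures + 1.0)
        arm = int(np.argmax(samples))
        reward = int(rng.random() < means[arm])  # Bernoulli reward
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[t] = arm
    return pulls


def cumulative_regret(means, pulls, gap_fn):
    """Running sum of gap_fn(gap) over the arms that were pulled."""
    means = np.asarray(means, dtype=float)
    gaps = means.max() - means[pulls]
    return np.cumsum(gap_fn(gaps))


if __name__ == "__main__":
    rng = np.random.default_rng(0)   # hypothetical seed
    means = [0.5, 0.45, 0.2]         # hypothetical arm means
    pulls = run_vanilla_ts(means, horizon=1000, rng=rng)
    print("standard regret:", cumulative_regret(means, pulls, standard_gap)[-1])
    print("hinge regret:   ", cumulative_regret(means, pulls, hinge_gap)[-1])
```

With these hypothetical means, the second arm has a gap of 0.05, which is below ϵ, so pulling it adds nothing to the hinge regret while still adding to the standard regret. Averaging such curves over many seeds would mirror the kind of comparison the table describes.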