Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Achieving Fairness in the Stochastic Multi-Armed Bandit Problem
Authors: Vishakha Patil, Ganesh Ghalme, Vineet Nair, Y. Narahari
JMLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conclude by experimentally validating our theoretical results. ... In this section, we show the results of simulations that validate our theoretical findings. ... For the experiments in Figure 2a, we consider a Fair-MAB instance with k = 10, µ1 = 0.8, and µi = µ1 − ∆i, where ∆i = 0.01i, and r = (0.05, 0.05, . . . , 0.05) ∈ [0, 1]^k. We show the results with regret computed over T = 10^6 time steps. |
| Researcher Affiliation | Academia | Vishakha Patil EMAIL Department of Computer Science and Automation Indian Institute of Science Bangalore, Karnataka, India; Ganesh Ghalme EMAIL Faculty of Industrial Engineering and Management Technion - Israel Institute of Technology Haifa, Israel; Vineet Nair EMAIL Faculty of Computer Science Technion - Israel Institute of Technology Haifa, Israel; Y. Narahari EMAIL Department of Computer Science and Automation Indian Institute of Science Bangalore, Karnataka, India |
| Pseudocode | Yes | Algorithm 1: Fair-Learn |
| Open Source Code | No | The paper does not contain any explicit statements or links indicating that the source code for the methodology is openly available. |
| Open Datasets | No | For evaluating the performance of our algorithm, we perform experiments on synthetic data sets as this allows for finer control on the tuning of the parameters of the experiment. In particular, we consider the following two Fair-MAB instances: Instance 1: Fairness vs. Regret ... Instance 2: Fair-UCB vs. LFG |
| Dataset Splits | No | The paper uses synthetic datasets described by their parameters (e.g., number of arms, mean rewards, fairness constraints) rather than pre-existing datasets with established splits. Therefore, the concept of training/test/validation splits for existing datasets does not apply here. |
| Hardware Specification | No | The paper does not specify any particular hardware used for running the experiments or simulations. There is no mention of CPU, GPU models, memory, or cloud computing platforms. |
| Software Dependencies | No | The paper mentions algorithms like UCB1, Fair-ucb, Fair-Learn, LFG, and Thompson Sampling. However, it does not specify any software libraries, programming languages with versions, or solvers used for the implementation or simulation of these algorithms. |
| Experiment Setup | Yes | For the experiments in Figure 2a, we consider a Fair-MAB instance with k = 10, µ1 = 0.8, and µi = µ1 − ∆i, where ∆i = 0.01i, and r = (0.05, 0.05, . . . , 0.05) ∈ [0, 1]^k. We show the results with regret computed over T = 10^6 time steps. ... Next, in Figure 2b, we consider a Fair-MAB instance with k = 10, µ = (0.8, 0.75, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.15, 0.1), and r = (0.05, . . . , 0.05). Here, we show how the cumulative regret varies as α takes different values. ... We consider an instance with k = 3, µ = (0.7, 0.5, 0.4), r = (0.2, 0.3, 0.25), and α = 0. |
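The paper's own implementation is not released (see the Open Source Code row), but the Experiment Setup row pins down the Figure 2a instance precisely enough to reproduce its structure. The sketch below is a hypothetical reconstruction, not the authors' Fair-Learn code: it pairs a standard UCB1 index with a simple quota rule (pull the most starved arm whenever some arm's pull count falls below r_i·t − α), which matches the spirit of the fairness constraint described in the table. The function name `simulate_fair_ucb` and the shortened horizon T = 10,000 (the paper uses T = 10^6) are our choices.

```python
import numpy as np

def simulate_fair_ucb(mu, r, T, alpha=0.0, seed=0):
    """Hypothetical sketch of a fairness-constrained UCB1 simulation.

    At each step t: if some arm i has been pulled fewer than
    r[i] * t - alpha times, pull the most starved arm; otherwise
    pull the arm maximizing the UCB1 index. Rewards are Bernoulli(mu[i]).
    Returns per-arm pull counts and cumulative regret against the best mean.
    """
    rng = np.random.default_rng(seed)
    k = len(mu)
    counts = np.zeros(k)
    sums = np.zeros(k)
    regret = 0.0
    for t in range(1, T + 1):
        deficits = r * t - alpha - counts
        if deficits.max() > 0:
            arm = int(np.argmax(deficits))   # honor the fairness quota first
        elif counts.min() == 0:
            arm = int(np.argmin(counts))     # initial exploration of unpulled arms
        else:
            ucb = sums / counts + np.sqrt(2.0 * np.log(t) / counts)
            arm = int(np.argmax(ucb))        # standard UCB1 index
        reward = rng.binomial(1, mu[arm])
        counts[arm] += 1
        sums[arm] += reward
        regret += mu.max() - mu[arm]
    return counts, regret

# Figure 2a instance from the table: k = 10, mu_i = 0.8 - 0.01*i, r_i = 0.05
mu = 0.8 - 0.01 * np.arange(10)
r = np.full(10, 0.05)
counts, regret = simulate_fair_ucb(mu, r, T=10_000)
```

Since the quotas satisfy Σ r_i = 0.5 < 1, the constraint is feasible and each arm ends with roughly at least r_i·T pulls, while the remaining budget concentrates on the highest-mean arm.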