MaxGap Bandit: Adaptive Algorithms for Approximate Ranking
Authors: Sumeet Katariya, Ardhendu Tripathy, Robert Nowak
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct three experiments. First, we verify the validity of our sample complexity bounds in Section 7.1. We then study the performance of our adaptive algorithms on simulated data in Section 7.2, and on the Streetview dataset in Section 7.3. |
| Researcher Affiliation | Collaboration | Sumeet Katariya (UW-Madison and Amazon, sumeetsk@gmail.com); Ardhendu Tripathy (UW-Madison, astripathy@wisc.edu); Robert Nowak (UW-Madison, rdnowak@wisc.edu) |
| Pseudocode | Yes | The paper contains Algorithm 1 (MaxGapElim), Algorithm 2 (MaxGapUCB), Algorithm 3 (MaxGapTop2UCB), and Algorithm 4 (procedure to find UΔ_a(t)). |
| Open Source Code | Yes | The code for all experiments is publicly available [19]. [19] Sumeet Katariya, Ardhendu Tripathy, and Robert Nowak. Code for maxgap bandit algorithms and experiments. 2019. URL https://github.com/sumeetsk/maxgap_bandit. |
| Open Datasets | Yes | For our third experiment we study performance on the Streetview dataset [17, 18]... [17] Sumeet Katariya, Lalit Jain, Nandana Sengupta, James Evans, and Robert Nowak. Chicago streetview dataset. 2018. URL https://github.com/sumeetsk/coarse_ranking/. |
| Dataset Splits | No | The paper does not provide explicit training, validation, or test dataset splits (e.g., specific percentages or sample counts) for reproducibility. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., 'Python 3.8, PyTorch 1.9'). |
| Experiment Setup | Yes | We used a lower bound based stopping condition for Random, Elimination, Top2UCB, and set c = 5 in the UCB stopping condition (value of c chosen empirically as in [13]). The rewards are normally distributed with σ = 0.05. ... each arm is a normal distribution with mean equal to the Borda safety score of the image and standard deviation σ = 0.05. |
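
The reward model quoted in the Experiment Setup row (each arm is a normal distribution with standard deviation σ = 0.05) can be simulated in a few lines. The sketch below is only an illustration under stated assumptions: the arm means are made-up values rather than Borda scores from the Streetview dataset, the function names are hypothetical, and the sampling is plain uniform sampling followed by a largest-adjacent-gap split, not the adaptive algorithms or the stopping conditions from the paper.

```python
import random

SIGMA = 0.05  # reward standard deviation quoted in the Experiment Setup row


def pull(mean, rng, sigma=SIGMA):
    """Draw one noisy reward from an arm modelled as Normal(mean, sigma)."""
    return rng.gauss(mean, sigma)


def empirical_means(means, pulls_per_arm, rng):
    """Pull every arm the same number of times and return each empirical mean."""
    return [
        sum(pull(m, rng) for _ in range(pulls_per_arm)) / pulls_per_arm
        for m in means
    ]


def max_gap_split(values):
    """Return (largest adjacent gap, sorted indices of the arms above it).

    Arms are ordered by value, descending; the gap is measured between
    neighbours in that order.
    """
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    gaps = [values[order[k]] - values[order[k + 1]] for k in range(len(order) - 1)]
    k_star = max(range(len(gaps)), key=gaps.__getitem__)
    return gaps[k_star], sorted(order[: k_star + 1])


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    # Hypothetical arm means, chosen only for illustration (not from the paper).
    hypothetical_means = [0.9, 0.85, 0.8, 0.4, 0.35]
    estimates = empirical_means(hypothetical_means, pulls_per_arm=200, rng=rng)
    gap, top_cluster = max_gap_split(estimates)
    print("arms above the largest gap:", top_cluster)
    print("estimated gap:", round(gap, 3))
```

With 200 pulls per arm the standard error of each empirical mean is about 0.05 / √200 ≈ 0.0035, far smaller than the 0.4 gap in these illustrative means, so uniform sampling separates the two clusters easily here. Harder instances, where the largest gap is close in size to the others, are the setting in which adaptive sampling matters.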