Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
On Universally Optimal Algorithms for A/B Testing
Authors: Po-An Wang, Kaito Ariu, Alexandre Proutiere
ICML 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We illustrate the performance of the ETT algorithm with α = 1/4 and different thresholds ε, and compare it to that of the uniform sampling algorithm and to that of an Oracle algorithm that selects arms using optimal exploration rate x(µ) = argmax_x g(x, µ). ... The error probabilities are derived from 40000 trials for each setting and algorithm. |
| Researcher Affiliation | Collaboration | ¹EECS and Digital Futures, KTH, Stockholm, Sweden; ²Cyber Agent, Tokyo, Japan. |
| Pseudocode | Yes | Algorithm 1 Successive Rejects (SR) ... Algorithm 2 Estimate and Thresholded Tracking (ETT) ... Algorithm 3 Randomized TCSF ... Algorithm 4 De-randomized TCSF |
| Open Source Code | No | The paper does not provide an explicit statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | No | We consider the instance: µ = (0.0005, 0.0001). ... The error probabilities are derived from 40000 trials for each setting and algorithm. The paper discusses theoretical properties of bandit instances, not public datasets. |
| Dataset Splits | No | The paper describes simulation trials for specific instances of bandit problems, not experiments on datasets with explicit train/validation/test splits. |
| Hardware Specification | No | The paper mentions numerical experiments but does not provide specific details on the hardware used, such as GPU/CPU models or memory. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers for reproducibility of the experiments. |
| Experiment Setup | Yes | We illustrate the performance of the ETT algorithm with α = 1/4 and different thresholds ε, and compare it to that of the uniform sampling algorithm and to that of an Oracle algorithm that selects arms using optimal exploration rate x(µ) = argmax_x g(x, µ). ... Figure 6 displays the error probability with a fixed budget of T = 20000 and varying ε from 0 to 0.0008. |
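
The simulation setup quoted in the table (a two-armed instance, a fixed budget T, and an error probability estimated over many trials) can be illustrated with a minimal Monte Carlo sketch of the uniform sampling baseline. This is not the authors' code and does not implement ETT or the Oracle algorithm. It assumes Bernoulli rewards and a uniformly random tie-break, neither of which is stated in the excerpts above, and the function name `uniform_sampling_error` is hypothetical.

```python
import random


def uniform_sampling_error(mu, budget, trials, seed=0):
    """Estimate the error probability of uniform sampling on a two-armed
    instance. Bernoulli rewards are an assumption of this sketch."""
    rng = random.Random(seed)
    best_arm = 0 if mu[0] >= mu[1] else 1
    pulls_per_arm = budget // 2  # uniform sampling splits the budget evenly
    errors = 0
    for _ in range(trials):
        # Count successes for each arm over its share of the budget.
        successes = [
            sum(rng.random() < p for _ in range(pulls_per_arm)) for p in mu
        ]
        if successes[0] == successes[1]:
            guess = rng.randrange(2)  # assumed tie-break: uniform at random
        else:
            # Equal pull counts, so comparing successes compares empirical means.
            guess = 0 if successes[0] > successes[1] else 1
        errors += guess != best_arm
    return errors / trials


if __name__ == "__main__":
    # Scaled-down run of the instance mu = (0.0005, 0.0001); the quoted
    # setting (T = 20000, 40000 trials) is slow in pure Python.
    print(uniform_sampling_error((0.0005, 0.0001), budget=2000, trials=200))
```

Running the quoted setting in full would mean calling the function with `budget=20000` and `trials=40000`; a vectorized sampler would be a more practical choice at that scale.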