Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Normal Bandits of Unknown Means and Variances

Authors: Wesley Cowan, Junya Honda, Michael N. Katehakis

JMLR 2017 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Remark 3. Numerical Regret Comparison: Figure 1 shows the results of a small simulation study done on a set of six populations with means and variances given in Table 1. It provides plots of the regrets when implementing policies π_CHK (the index policy of Eq. (13)), π_ACF (the index policy of Eq. (3)), and π_G, a greedy policy that always activates the bandit with the current highest average. Each policy was implemented over a horizon of 100,000 activations, each replicated 10,000 times to produce a good estimate of the average regret R_π(n) over the times indicated.
Researcher Affiliation | Academia | Wesley Cowan (EMAIL), Department of Mathematics, Rutgers University ...; Junya Honda (EMAIL), Department of Complexity Science and Engineering, Graduate School of Frontier Sciences, The University of Tokyo ...; Michael N. Katehakis (EMAIL), Department of Management Science and Information Systems, Rutgers University
Pseudocode | Yes | Policy π_ACF (UCB1-NORMAL). At each n = 1, 2, ...: i) Sample from any bandit i for which T_i^{π_ACF}(n) < 8 ln n. ii) If T_i^{π_ACF}(n) ≥ 8 ln n for all i = 1, ..., N, sample from bandit π_ACF(n+1), with π_ACF(n+1) = arg max_i { X̄_i(T_i^π(n)) + 4 √( S_i(T_i^π(n)) · ln n / T_i^π(n) ) }
Open Source Code | No | The paper does not contain any explicit statement about making the source code available, nor does it provide a link to a code repository.
Open Datasets | No | Remark 3. Numerical Regret Comparison: Figure 1 shows the results of a small simulation study done on a set of six populations with means and variances given in Table 1. It provides plots of the regrets when implementing policies π_CHK... Each policy was implemented over a horizon of 100,000 activations, each replicated 10,000 times to produce a good estimate of the average regret R_π(n) over the times indicated.
Dataset Splits | No | The paper uses simulated data generated from specified normal distributions, not a pre-existing dataset that would require explicit training/test/validation splits.
Hardware Specification | No | The paper does not provide specific details about the hardware used for running the simulations, only describing the simulation methodology itself.
Software Dependencies | No | The paper does not mention any specific software or library names with version numbers that would be needed to replicate the experiments.
Experiment Setup | Yes | Each policy was implemented over a horizon of 100,000 activations, each replicated 10,000 times to produce a good estimate of the average regret R_π(n) over the times indicated. The simulation study was done on a set of six populations with means and variances given in Table 1.
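For concreteness, the quoted π_ACF (UCB1-NORMAL) rule and the Monte Carlo regret protocol can be sketched as below. This is a minimal illustrative sketch, not the authors' implementation: the function and variable names are our own, the instance is a toy two-arm problem rather than the six populations of Table 1, and the horizon/replication counts are far smaller than the 100,000 activations × 10,000 replications used in the paper.

```python
import math
import random

def ucb1_normal_choice(counts, means, m2, n):
    """Pick an arm per the pi_ACF (UCB1-NORMAL) rule quoted above.

    counts[i]: times arm i was sampled; means[i]: running sample mean;
    m2[i]: running sum of squared deviations (Welford), so the sample
    variance of arm i is m2[i] / (counts[i] - 1).
    """
    # i) sample from any bandit played fewer than 8 ln n times
    threshold = 8.0 * math.log(n)
    for i in range(len(counts)):
        if counts[i] < threshold:
            return i
    # ii) otherwise maximize  mean_i + 4 * sqrt(S_i * ln n / T_i)
    def index(i):
        var = m2[i] / (counts[i] - 1)
        return means[i] + 4.0 * math.sqrt(var * math.log(n) / counts[i])
    return max(range(len(counts)), key=index)

def average_regret(mu, sigma, horizon, reps, seed=0):
    """Monte Carlo estimate of the average regret R_pi(horizon)
    for pi_ACF on normal arms with means mu and std devs sigma."""
    rng = random.Random(seed)
    best = max(mu)
    total = 0.0
    k = len(mu)
    for _ in range(reps):
        counts = [0] * k
        means = [0.0] * k
        m2 = [0.0] * k

        def pull(i):
            # sample arm i, update Welford statistics, accumulate regret
            nonlocal total
            x = rng.gauss(mu[i], sigma[i])
            counts[i] += 1
            d = x - means[i]
            means[i] += d / counts[i]
            m2[i] += d * (x - means[i])
            total += best - mu[i]

        # seed each arm with two samples so the sample variance exists
        for i in range(k):
            pull(i)
            pull(i)
        for n in range(2 * k + 1, horizon + 1):
            pull(ucb1_normal_choice(counts, means, m2, n))
    return total / reps
```

With a toy instance such as `average_regret([0.0, 1.0], [1.0, 1.0], horizon=2000, reps=10)`, the forced-sampling rule i) alone contributes roughly 8 ln(2000) ≈ 61 pulls of the suboptimal arm, so the estimated regret should sit well below the ~2000 a policy stuck on the wrong arm would incur.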