Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Near Optimal Non-asymptotic Sample Complexity of 1-Identification
Authors: Zitian Li, Wang Chi Cheung
ICML 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We design a new algorithm, Sequential-Exploration-Exploitation (SEE), and conduct theoretical analysis from the non-asymptotic perspective. Novel to the literature, we achieve near optimality, in the sense of matching upper and lower bounds on the pulling complexity. We also conduct numerical experiments comparing the performance of 1-identification algorithms; the results indicate the effectiveness of our proposed SEE algorithm relative to existing benchmarks and also expose weaknesses of some benchmark algorithms. |
| Researcher Affiliation | Academia | 1Department of Industrial Systems Engineering & Management, National University of Singapore, Singapore. Correspondence to: Zitian Li <EMAIL>, Wang Chi Cheung <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 SEE (Informal) ... Algorithm 2 Sequential-Explore-Exploit (SEE) ... Algorithm 3 SEE-Exploration ... Algorithm 4 SEE-Exploitation ... Algorithm 5 Sampling Rule |
| Open Source Code | No | The paper does not contain any explicit statement about releasing source code or a link to a code repository. |
| Open Datasets | No | We conduct numerical evaluations on SEE and existing benchmark algorithms on synthetic instances. We consider six different groups of reward vectors: All Worse, Unique Qualified, One Quarter, Half Good, All Good, and Linear. The main difference among these groups is the number of qualified arms. All instances in the All Worse group are negative, meaning the mean rewards of all arms are below µ0. The other groups of instances are positive, meaning there is at least one arm whose mean reward is > µ0. |
| Dataset Splits | No | We conduct numerical evaluations on SEE and existing benchmark algorithms on synthetic instances... In each experiment setting, we run 1000 independent trials and compute the empirical average of the stopping times. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments. It only discusses the experimental settings and outcomes without mentioning any specific hardware specifications. |
| Software Dependencies | No | The paper does not provide specific software dependencies or version numbers for the tools and libraries used in the implementation. |
| Experiment Setup | Yes | The parameter setting of SEE is δ_k = 1/3^k, β_k = 2^(k/4), α_k = 5^k, C = 1.01. In all numerical experiments, all the algorithms in Section 6 achieve 100% accuracy in identifying a correct answer... We took arm numbers K = 10, 20, 30, 40, 50, 100, 150, 200 and tolerance levels δ = 0.001, 0.0001. Fix µ0 = 0.5, = 0.15... We set 10^8 as our forced stopping threshold for each group of experiments... When implementing the algorithm lil'HDoC, its originally proposed warm-up stage is longer than the total number of pulls of some of the benchmark algorithms. To allow comparisons, in our numerical experiments we uniformly pull all the arms only 200 times for lil'HDoC. |
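To make the reported setup concrete, here is a minimal Python sketch of how the six synthetic instance groups and the trial-averaging protocol (1000 independent runs, forced stopping cap of 10^8) could be reproduced. The paper specifies only the number of qualified arms per group, the threshold µ0 = 0.5, and a 0.15 gap; the concrete mean-reward values below, and the `identify` callback interface, are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

MU0 = 0.5   # qualification threshold from the paper
GAP = 0.15  # gap value reported in the setup; used here as an assumed offset

def make_instance(group: str, K: int) -> np.ndarray:
    """Build a mean-reward vector for one of the six groups.

    Only the count of qualified arms (mean > MU0) per group is taken
    from the paper; the exact reward values are illustrative guesses.
    """
    mus = np.full(K, MU0 - GAP)  # start with every arm below threshold
    n_good = {
        "All Worse": 0,
        "Unique Qualified": 1,
        "One Quarter": K // 4,
        "Half Good": K // 2,
        "All Good": K,
    }.get(group)
    if n_good is None:
        if group == "Linear":
            # evenly spaced means straddling the threshold
            return np.linspace(MU0 - GAP, MU0 + GAP, K)
        raise ValueError(f"unknown group: {group}")
    mus[:n_good] = MU0 + GAP
    return mus

def run_trials(identify, mus, n_trials=1000, cap=10**8, seed=0):
    """Average stopping time over independent trials, with a forced cap.

    `identify(mus, rng)` is a placeholder for any 1-identification
    algorithm; it should return the trial's stopping time (pull count).
    """
    rng = np.random.default_rng(seed)
    stops = [min(identify(mus, rng), cap) for _ in range(n_trials)]
    return float(np.mean(stops))
```

A real experiment would plug SEE (or a benchmark such as lil'HDoC) in as `identify` and sweep K over {10, 20, 30, 40, 50, 100, 150, 200} and δ over {0.001, 0.0001}, as the paper does.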