Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.
Budgeted-Bandits with Controlled Restarts with Applications in Learning and Computing
Authors: Semih Cayci, Yilin Zheng, Atilla Eryilmaz
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Furthermore, through numerical studies, we verified the applicability of our algorithm in the diverse contexts of: (i) algorithm portfolios for SAT solvers; (ii) task scheduling in wireless networks; and (iii) hyperparameter tuning in neural network training. |
| Researcher Affiliation | Collaboration | Semih Cayci, Department of Mathematics, RWTH Aachen University; Yilin Zheng, Google; Atilla Eryilmaz, Department of Electrical and Computer Engineering, The Ohio State University |
| Pseudocode | Yes | Algorithm 1: Asymptotically-Optimal Offline Policy πoff; Algorithm 2: Online Learning Algorithms for Finite Set of Restart Times UCB-RM (πM) and UCB-RB (πB); Algorithm 3: Online Learning Algorithms for Continuous Set of Restart Times UCB-RC (πC) |
| Open Source Code | No | The paper does not explicitly state that source code for the methodology is provided or include any links to code repositories. |
| Open Datasets | Yes | We evaluated the performance of the meta-algorithms over the widely used Uniform Random-3-SAT benchmark set of satisfiable problem instances in the SATLIB library (Hoos & Stützle, 2000). Specifically, in our setup, we trained a RESNET-16 over the CIFAR-10 dataset |
| Dataset Splits | Yes | Specifically, in our setup, we trained a RESNET-16 over the CIFAR-10 dataset where we used 80 : 20 split for training set and testing set. |
| Hardware Specification | No | The paper mentions "GPU time" as a general resource and "personal computer" for average running time, but does not provide specific hardware details like GPU/CPU models or memory amounts used for experiments. |
| Software Dependencies | No | The paper mentions "SGD optimizer" but does not specify any software libraries or frameworks with version numbers that were used for the implementation. |
| Experiment Setup | Yes | For these experiments, the restart times are finite. Therefore, we used the UCB-RB Algorithm with α = 2.01, (1 + β)²/(1 − β) = 1.01. For initialization, the controller performed 40 trials for each (k, t_l) decision. The learning rate is set to 0.001 using an SGD optimizer with a batch size of 64. We choose 0.9 as the reward threshold. |