Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.

Optimistic Online-to-Batch Conversions for Accelerated Convergence and Universality

Authors: Yu-Hu Yan, Peng Zhao, Zhi-Hua Zhou

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental In this section, we conduct numerical experiments to validate the effectiveness of our proposed methods. We evaluate our methods in the squared loss minimization and logistic regression tasks across multiple LIBSVM datasets under both non-universal and universal settings, comparing against classic methods including NAG, gradient descent, UniXGrad [Kavis et al., 2019], and the method in Joulani et al. [2020b]. The results demonstrate that our method achieves comparable or superior convergence performance while maintaining competitive computational efficiency. Detailed setup descriptions and experimental results can be found in Appendix E. [...] Figure 2 plots the suboptimality gap and time complexity of all methods, in the non-universal setting. [...] Figure 3 plots the suboptimality gap and time complexity of all methods, in the universal setting.
Researcher Affiliation Academia Yu-Hu Yan, Peng Zhao, Zhi-Hua Zhou National Key Laboratory for Novel Software Technology, Nanjing University, China School of Artificial Intelligence, Nanjing University, China EMAIL
Pseudocode Yes Algorithm 1 Vanilla/Stabilized Online-to-Batch Conversion
Open Source Code No The paper does not provide an explicit statement or link for open-source code for the methodology it describes. The NeurIPS checklist states "Our paper does not include experiments requiring code" and "Our paper does not release new assets."
Open Datasets Yes We evaluate our methods in the squared loss minimization and logistic regression tasks across multiple LIBSVM datasets under both non-universal and universal settings
Dataset Splits No For the squared loss task, we take the least squares problem with L2-norm ball constraint for this setting, i.e., $f(x) \triangleq \frac{1}{2N} \|Ax - b\|_2^2$, where $\|x\|_2 < R$, $A \in \mathbb{R}^{N \times d}$ follows a normal distribution of $\mathcal{N}(0, \sigma^2 I)$ and $b = Ax + \varepsilon$ such that $\varepsilon$ is a random vector $\mathcal{N}(0, 10^{-3})$. We pick $N = 500$ and $d = 100$. [...] For the logistic regression task, the performance is measured by the ℓ2-regularized logistic loss $f(x) \triangleq \frac{1}{N} \sum_{i=1}^{N} \log(1 + \exp(-b_i a_i^\top x)) + \mu \|x\|_2^2$, where $a_t \in \mathbb{R}^d$ and $b_t \in \{-1, +1\}$ are chosen from a dataset $\{a_t, b_t\}_{i=1}^{N}$, $\mu = 0.005$ is the parameter of the regularization term to prevent overfitting. The paper mentions dataset usage but does not provide specific training/test/validation splits.
Hardware Specification No The paper describes the experimental setup in Appendix E, stating 'We report average results of the suboptimality gap, i.e., $f(\cdot) - \min_{x \in \mathcal{X}} f(x)$, in the logarithmic scale, and time complexity with standard deviations of 5 independent runs. Only the randomness of the initial point is preserved. All hyper-parameters are set to be theoretically optimal.' However, it does not provide any specific details about the hardware used to run these experiments.
Software Dependencies No The paper describes experimental comparisons with various algorithms like NAG, GD, UniXGrad, and JRGS '20, but it does not specify any software names with version numbers or particular programming languages used for implementation.
Experiment Setup Yes Our experiment setup is mainly inspired by Kavis et al. [2019]. We investigate two kinds of convex smooth optimization problems: the squared loss task and the logistic regression task. For the squared loss task, we take the least squares problem with L2-norm ball constraint for this setting, i.e., $f(x) \triangleq \frac{1}{2N} \|Ax - b\|_2^2$, where $\|x\|_2 < R$, $A \in \mathbb{R}^{N \times d}$ follows a normal distribution of $\mathcal{N}(0, \sigma^2 I)$ and $b = Ax + \varepsilon$ such that $\varepsilon$ is a random vector $\mathcal{N}(0, 10^{-3})$. We pick $N = 500$ and $d = 100$. [...] For the logistic regression task, the performance is measured by the ℓ2-regularized logistic loss $f(x) \triangleq \frac{1}{N} \sum_{i=1}^{N} \log(1 + \exp(-b_i a_i^\top x)) + \mu \|x\|_2^2$, where $a_t \in \mathbb{R}^d$ and $b_t \in \{-1, +1\}$ are chosen from a dataset $\{a_t, b_t\}_{i=1}^{N}$, $\mu = 0.005$ is the parameter of the regularization term to prevent overfitting. The smoothness parameter $L$ of the logistic objective is $\frac{1}{4N} \lambda_{\max}(\sum_{i=1}^{N} a_i a_i^\top)$, where $\lambda_{\max}(\cdot)$ is the largest eigenvalue of the matrix. Here we use five LIBSVM datasets to initialize the logistic loss. In the non-universal setting, all the methods can use the knowledge of the smoothness parameter L, which is prohibited in the universal setting.
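
Illustrative sketch (not from the paper, which releases no code): the two objectives and the smoothness parameter quoted in the Experiment Setup row could be written in NumPy roughly as follows. Several details are assumptions made only for illustration: the value of sigma, the reference vector x_ref used to generate b, the reading of 10^-3 as a variance, the radius R, and every function name below are hypothetical and are not specified in the quoted text.

```python
import numpy as np


def make_least_squares(N=500, d=100, sigma=1.0, noise_var=1e-3, seed=0):
    """Draw a synthetic least-squares instance with N = 500 and d = 100.

    Assumptions (not given in the quoted text): sigma = 1.0, a Gaussian
    reference vector x_ref, and 1e-3 interpreted as the noise variance.
    """
    rng = np.random.default_rng(seed)
    A = rng.normal(0.0, sigma, size=(N, d))            # entries ~ N(0, sigma^2)
    x_ref = rng.normal(size=d)                         # hypothetical generator
    eps = rng.normal(0.0, np.sqrt(noise_var), size=N)  # noise vector
    b = A @ x_ref + eps
    return A, b


def squared_loss(x, A, b):
    """f(x) = 1/(2N) * ||Ax - b||_2^2."""
    residual = A @ x - b
    return 0.5 * np.mean(residual ** 2)


def project_l2_ball(x, R):
    """Euclidean projection onto the L2-norm ball of radius R (R is assumed)."""
    norm = np.linalg.norm(x)
    return x if norm <= R else x * (R / norm)


def logistic_loss(x, a, b, mu=0.005):
    """f(x) = 1/N * sum_i log(1 + exp(-b_i a_i^T x)) + mu * ||x||_2^2.

    `a` has shape (N, d) with rows a_i; `b` holds labels in {-1, +1}.
    """
    margins = b * (a @ x)
    # logaddexp(0, -m) equals log(1 + exp(-m)) and is numerically stable.
    return np.mean(np.logaddexp(0.0, -margins)) + mu * (x @ x)


def logistic_smoothness(a):
    """L = 1/(4N) * lambda_max(sum_i a_i a_i^T), as stated in the setup."""
    N = a.shape[0]
    gram = a.T @ a                                     # equals sum_i a_i a_i^T
    return np.linalg.eigvalsh(gram)[-1] / (4.0 * N)    # eigvalsh sorts ascending
```

In the non-universal setting described above, a value such as logistic_smoothness(a) would be available to every method (for example, to set a step size), whereas in the universal setting it would not be passed to the algorithm.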