Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Non-stationary Online Learning with Memory and Non-stochastic Control
Authors: Peng Zhao, Yu-Hu Yan, Yu-Xiang Wang, Zhi-Hua Zhou
JMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Although our paper mainly focuses on the theoretical investigation, in this section, we further present empirical studies to support our theoretical findings. We report the results of OCO with memory in Section 6.1 and online non-stochastic control in Section 6.2. [...] Figure 1 plots performance comparisons of three algorithms (OGD, Ader, Scream) under different regularizer coefficients. [...] Figure 2 plots the performance comparison of three algorithms (OGD, Ader, Scream) in terms of the cumulative cost. The result shows that our proposed algorithm outperforms the other two contenders, which validates that the meta-base structure (compared with OGD) and the switching-cost-regularizer (compared with Ader) are necessary for online non-stochastic control problems in non-stationary environments. |
| Researcher Affiliation | Academia | Peng Zhao, National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, China; Yu-Hu Yan, National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, China; Yu-Xiang Wang, Department of Computer Science, University of California, Santa Barbara, CA 93106, USA; Zhi-Hua Zhou, National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, China |
| Pseudocode | Yes | Algorithm 1 (Scream); Algorithm 2 (Lazy Scream); Algorithm 3 (Scream.Control); Algorithm 4 (System Identification via Random Inputs; Hazan et al., 2020) |
| Open Source Code | No | The paper does not contain an unambiguous statement of code release, nor does it provide a link to a code repository. The text "All the proofs are included in appendices." and licensing information do not refer to source code for the methodology. |
| Open Datasets | No | The paper describes generating data for experiments: "The data item of each round is denoted by (x_t, y_t) ∈ X × Y". It also mentions using "synthetic linear dynamical system (LDS) environments and a real inverted pendulum environment". While the inverted pendulum is a known control problem, the paper describes its environment setup rather than referencing a publicly available dataset of pendulum data for download. No specific links, DOIs, repositories, or formal citations are provided for any publicly available datasets. |
| Dataset Splits | No | The paper describes a simulated online learning scenario where data is generated dynamically: "The data item of each round is denoted by (x_t, y_t) ∈ X × Y", and "The underlying model w*_t will change every 1000 rounds". For the non-stochastic control, it mentions "synthetic linear dynamical system (LDS) environments" and a "real inverted pendulum environment". These are generative or real-time simulation setups, not fixed datasets with explicit training/test/validation splits described. |
| Hardware Specification | No | The paper describes its experimental settings in Section 6 but does not provide any specific details about the hardware (e.g., GPU/CPU models, memory) used to conduct these experiments. |
| Software Dependencies | No | The paper describes algorithms and experimental setups but does not list any specific software components (e.g., libraries, frameworks) with version numbers that were used for implementation or experimentation. |
| Experiment Setup | Yes | The time horizon is set as T = 50000 and the dimension is set as d = 10. [...] The underlying model w*_t will change every 1000 rounds, randomly sampled from a d-dimensional ball with diameter D/2, so there are in total S = 50 changes. We use the squared loss as the loss function, defined as f_t(w) = (1/2)(w^T x_t − y_t)^2, and thus the gradient is ∇f_t(w) = (w^T x_t − y_t) x_t. The feasible set W is also set as a d-dimensional ball with diameter D/2, and thus from all the above settings, we know that ‖x_t‖_2 ≤ Γ, ‖w‖_2 ≤ D/2, and ‖∇f_t(w)‖_2 ≤ DΓ². We set Γ = 1 and D = 2, so the gradient norm is upper bounded by G = DΓ² = 2. [...] We set the regularizer coefficient λ = αG, where G is the gradient norm upper bound. We consider three cases with different regularizer coefficients that impose different levels of penalty on the switching cost: (i) small regularizer (α = 0.1); (ii) medium regularizer (α = 1); (iii) large regularizer (α = 2). We repeat the experiments five times and report the mean and standard deviation of different algorithms with respect to three performance measures (overall loss, cumulative loss, and switching cost). |
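The experiment-setup excerpt above can be sketched in code. The following is a minimal, hedged reconstruction of the quoted synthetic OCO setting: piecewise-stationary targets w*_t resampled every 1000 rounds, the squared loss f_t(w) = (1/2)(w^T x_t − y_t)^2, and a switching-cost accumulator with λ = αG. It uses a plain projected OGD learner as a stand-in baseline; it is not the paper's Scream or Ader implementation, the horizon is shortened, and all concrete choices (step size, sampling scheme) are illustrative assumptions.

```python
import numpy as np

# Hedged sketch of the quoted synthetic setup; NOT the paper's code.
# Assumptions: projected OGD baseline, shortened horizon, standard step size.

rng = np.random.default_rng(0)
T, d = 5000, 10            # paper uses T = 50000; shortened here for a quick run
D, Gamma = 2.0, 1.0        # bounds from the quoted setup
G = D * Gamma ** 2         # stated gradient-norm upper bound, G = 2
R = D / 2                  # radius consistent with the quoted bound ||w||_2 <= D/2

def project(v, radius):
    """Euclidean projection onto the centered ball of the given radius."""
    n = np.linalg.norm(v)
    return v if n <= radius else v * (radius / n)

def sample_ball(radius):
    """Uniform point in the centered d-dimensional ball of the given radius."""
    v = rng.standard_normal(d)
    return radius * rng.uniform() ** (1.0 / d) * v / np.linalg.norm(v)

w_star = sample_ball(R)                 # underlying model w*_t
w = np.zeros(d)
eta = D / (G * np.sqrt(T))              # standard OGD step size (an assumption)
alpha = 1.0                             # "medium regularizer" case
lam = alpha * G                         # lambda = alpha * G, as quoted
cumulative_loss = switching_cost = 0.0

for t in range(T):
    if t > 0 and t % 1000 == 0:
        w_star = sample_ball(R)         # model changes every 1000 rounds
    x = project(rng.standard_normal(d), Gamma)   # enforce ||x_t||_2 <= Gamma
    y = float(w_star @ x)
    cumulative_loss += 0.5 * (w @ x - y) ** 2    # f_t(w) = 1/2 (w^T x_t - y_t)^2
    grad = (w @ x - y) * x                       # grad f_t(w) = (w^T x_t - y_t) x_t
    w_next = project(w - eta * grad, R)          # feasible set: ball of radius D/2
    switching_cost += lam * np.linalg.norm(w_next - w)
    w = w_next

print(f"cumulative loss: {cumulative_loss:.2f}, switching cost: {switching_cost:.2f}")
```

This mirrors the three performance measures mentioned in the excerpt (the "overall loss" would be the sum of the cumulative loss and the switching cost); comparing such an OGD baseline against a meta-base algorithm is exactly the kind of contrast Figures 1 and 2 of the paper report.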