Cogra: Concept-Drift-Aware Stochastic Gradient Descent for Time-Series Forecasting

Authors: Kohei Miyaguchi, Hiroshi Kajino

AAAI 2019, pp. 4594-4601 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | As a result of comprehensive experiments, we find that (i) our SMT can estimate the mean better than vSGD's estimator in the presence of concept drift, and (ii) in terms of predictive performance, Cogra reduces the predictive loss by 16-67% for real-world datasets, indicating that SMT improves the prediction accuracy significantly. The effectiveness of our method is empirically validated by extensive simulations. Specifically, we design two experiments to answer the following questions: (Q1) Does SMT estimate the moments better than the estimator used in vSGD? (Q2) Does SMT improve the predictive performance? (Q3) When does Cogra outperform the other SGDs? The first experiment, answering (Q1), evaluates the estimation error of a moment on synthetic data. The result shows that SMT decreases the error by 60% in total, measured by squared loss, as compared to vSGD, answering (Q1) in the affirmative. The second one, answering (Q2) and (Q3), evaluates the predictive performance on both synthetic and real-world data.
Researcher Affiliation | Collaboration | Kohei Miyaguchi, The University of Tokyo, Tokyo, Japan (kohei_miyaguchi@mist.i.u-tokyo.ac.jp); Hiroshi Kajino, IBM Research Tokyo, Tokyo, Japan (kajino@jp.ibm.com)
Pseudocode | Yes | Algorithm 1: Sequential Mean Tracker (SMT); Algorithm 2: Cogra algorithm (an illustrative sketch follows the table)
Open Source Code | No | The paper does not provide any explicit statement about releasing source code or a link to a code repository for the described methodology.
Open Datasets | Yes | We employ VAR(3) as a predictive model and use three datasets from the UCI repository (Lichman 2013). The activity recognition dataset records three-dimensional acceleration data (Casale, Pujol, and Radeva 2012). The gas sensor array dataset (Fonollosa et al. 2015) collects the recordings of 18 chemical sensors exposed to gas mixtures with dynamically varying concentrations. (An online VAR sketch follows the table.)
Dataset Splits | No | The paper describes using synthetic and real-world datasets and an online learning procedure where the model predicts the next data point, but it does not specify explicit training, validation, or test dataset splits (e.g., percentages, sample counts, or predefined split references).
Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as GPU or CPU models, memory, or cloud computing instance types.
Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies, libraries, or programming languages used in the experiments.
Experiment Setup | Yes | For AdaGrad, ADAM, and RMSProp, we employ multiple initial learning rates, fixing the other hyperparameters as recommended in the original papers. The initial learning rates are searched over {10^-x} for x = 0, ..., 3. For RMSProp, we add 10^-4 in the real-world experiments so as to show that the best rate resides inside the search space, not on its boundary. Almeida requires careful tuning of the initial learning rate and the hyper-learning rate. We fix the hyper-learning rate as 10^-3, 10^-2, and 10^-1, and we search the initial learning rate so that the model parameters do not diverge. Predictive Model: We employ vector autoregression, VAR(p) (Lütkepohl 2005). (A learning-rate sweep sketch follows the table.)
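The paper's Algorithms 1 and 2 are not reproduced here. As an illustration only, the sketch below implements a simple exponentially forgetting mean estimator as a stand-in for the Sequential Mean Tracker: the class name `MeanTracker`, the fixed forgetting rate `eta`, and the toy drift scenario are all assumptions, since the paper's SMT chooses its rate adaptively.

```python
import numpy as np

class MeanTracker:
    """Illustrative stand-in for Algorithm 1 (SMT); NOT the paper's update.

    Tracks a possibly drifting mean with exponential forgetting:
    mu <- mu + eta * (x - mu). The paper's SMT adapts its rate online;
    here eta is fixed for simplicity (assumption).
    """

    def __init__(self, dim, eta=0.1):
        self.mu = np.zeros(dim)
        self.eta = eta  # fixed forgetting rate (assumption)

    def update(self, x):
        self.mu += self.eta * (x - self.mu)
        return self.mu

# Toy check: the estimate follows a mean that jumps (abrupt concept drift).
rng = np.random.default_rng(0)
tracker = MeanTracker(dim=1)
for t in range(2000):
    true_mean = 0.0 if t < 1000 else 3.0  # drift at t = 1000
    est = tracker.update(true_mean + rng.normal(size=1))
print(f"estimate after drift: {est[0]:.2f} (true mean 3.0)")
```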
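The evaluation protocol is online one-step-ahead forecasting with a VAR(3) model: predict the next data point, observe it, then update. The sketch below follows that protocol under stated assumptions: plain SGD on the squared loss stands in for Cogra and the baseline optimizers, and the helper names `toy_series` and `online_var_forecast` are hypothetical.

```python
import numpy as np

def toy_series(T=500, d=2, seed=1):
    """Stationary toy data: x_t = 0.5 * x_{t-1} + noise (assumption)."""
    rng = np.random.default_rng(seed)
    x = np.zeros((T, d))
    for t in range(1, T):
        x[t] = 0.5 * x[t - 1] + rng.normal(size=d)
    return x

def online_var_forecast(series, p=3, lr=0.01):
    """One-step-ahead online forecasting with a VAR(p) model.

    At each step t we predict x_t from the previous p observations,
    record the squared loss, then take one plain-SGD step. Plain SGD
    is a stand-in for Cogra/AdaGrad/ADAM/RMSProp in the paper's setup.
    """
    T, d = series.shape
    A = np.zeros((d, p * d))               # stacked lag-coefficient matrices
    b = np.zeros(d)                        # intercept
    losses = []
    for t in range(p, T):
        h = series[t - p:t][::-1].ravel()  # lagged context, newest lag first
        pred = A @ h + b
        err = pred - series[t]
        losses.append(float(err @ err))
        A -= lr * np.outer(err, h)         # gradient of 0.5 * ||err||^2
        b -= lr * err
    return np.array(losses)

print("mean predictive loss:", online_var_forecast(toy_series()).mean())
```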
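A minimal sketch of the learning-rate search over {10^-x} for x = 0, ..., 3, reusing `toy_series` and `online_var_forecast` from the previous sketch (so it is not self-contained on its own). Selecting by mean predictive loss and discarding non-finite (diverged) runs are assumptions, echoing the paper's note that some settings make the parameters diverge.

```python
import numpy as np

# Hypothetical sweep over initial learning rates {10^-x}, x = 0..3.
# Requires toy_series and online_var_forecast from the previous sketch.
series = toy_series(seed=2)

results = {}
for x in range(4):
    lr = 10.0 ** (-x)
    with np.errstate(over="ignore", invalid="ignore"):  # let bad rates diverge quietly
        results[lr] = float(np.mean(online_var_forecast(series, p=3, lr=lr)))

# Rates that make the parameters diverge yield non-finite losses and are
# discarded (cf. the Almeida note about avoiding divergence).
finite = {lr: v for lr, v in results.items() if np.isfinite(v)}
best_lr = min(finite, key=finite.get)
print("mean loss per rate:", {f"{lr:g}": v for lr, v in results.items()})
print("best initial learning rate:", best_lr)
```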