Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et alK. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026..

DDG-DA: Data Distribution Generation for Predictable Concept Drift Adaptation

Authors: Wendi Li, Xiao Yang, Weiqing Liu, Yingce Xia, Jiang Bian4092-4100

AAAI 2022 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We conduct experiments on three realworld tasks (forecasting on stock price trend, electricity load and solar irradiance) and obtain signiﬁcant improvement on multiple widely-used models. Extensive experiments have been conducted on a variety of popular streaming data scenarios with multiple widely-used learning models to evaluate the effectiveness of our proposed method DDG-DA. Experimental results have shown that DDG-DA could enhance the performance of learning tasks in all these scenarios.
Researcher Affiliation	Collaboration	Wendi Li1,2, Xiao Yang2, Weiqing Liu2, Yingce Xia2, Jiang Bian2 1 University of Wisconsin Madison 2 Microsoft Research EMAIL, EMAIL
Pseudocode	No	The paper describes the method using diagrams and mathematical formulations, but does not include structured pseudocode or algorithm blocks.
Open Source Code	Yes	The code of DDG-DA is open-source on Github2. 2https://github.com/Microsoft/qlib/tree/main/examples/benchmarks/dynamic/DDG-DA
Open Datasets	Yes	The experiments are conducted on multiple datasets in three real-world popular scenarios (forecasting on stock price trend, electricity load and solar irradiance (Grinold and Kahn 2000; Pedro, Larson, and Coimbra 2019)).
Dataset Splits	Yes	For each timestamp t, the target of task(t) := (D(t) train, D(t) test) is to learn a new model or adapt an existing model on historical data D(t) train and minimize the loss on D(t) test. These tasks are arranged in chronological order and separated at the beginning of 2016 into Tasktrain (all D(t) test in Tasktrain range from 2011 to 2015) and Tasktest (all D(t) test in Tasktest range from 2016 to 2020). On the validation data, Light GBM (Ke et al. 2017) performs best on average.
Hardware Specification	No	The paper does not provide specific details about the hardware used to run the experiments.
Software Dependencies	No	The paper mentions software like Light GBM, LSTM, and GRU models, but does not provide specific version numbers for these or other software dependencies.
Experiment Setup	Yes	To handle the concept drift in data, we retrain a new model each month (the rolling time interval is 1 month) based on two years of historical data (memory size is limited in an online setting). The time interval of two adjacent tasks is 20 trading days, 7 days and 7 days in stock price trend forecasting, electricity load forecasting and solar irradiance respectively.