Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.

Time-o1: Time-Series Forecasting Needs Transformed Label Alignment

Authors: Hao Wang, Licheng Pan, Zhichao Chen, Xu Chen, Qingyang Dai, Lei Wang, Haoxuan Li, Zhouchen Lin

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experiments demonstrate that Time-o1 achieves state-of-the-art performance and is compatible with various forecast models. To demonstrate the efficacy of Time-o1, there are six aspects empirically investigated: 1. Performance: Does Time-o1 work? We compare Time-o1 with state-of-the-art baselines using public datasets on long-term forecasting in Section 4.2 and short-term forecasting tasks in Appendix E.1. Moreover, we compare Time-o1 with other loss functions in Section 4.3. 2. Gain: How does it work? Section 4.4 offers an ablative study to dissect the contributions of the individual factors of Time-o1, elucidating their roles in enhancing forecast accuracy.
Researcher Affiliation	Collaboration	Hao Wang1 Licheng Pan1 Zhichao Chen2 Xu Chen3 Qingyang Dai4 Lei Wang3 Haoxuan Li5 Zhouchen Lin2,6,7 1Xiaohongshu Inc. 2State Key Lab of General AI, School of Intelligence Science and Technology, Peking University 3Gaoling School of Artificial Intelligence, Renmin University of China 4Department of Control Science and Engineering, Zhejiang University 5Center for Data Science, Peking University 6Institute for Artificial Intelligence, Peking University 7Pazhou Laboratory (Huangpu), Guangzhou, Guangdong, China
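Pseudocode	Yes	Algorithm 1 The workflow of Time-o1. Input: $\hat{Y}$: forecast sequences, $Y$: label sequences. Parameter: $\alpha$: the relative weight of the transformed loss, $\gamma$: the ratio of retained components. Output: $\mathcal{L}_{\alpha,\gamma}$: the obtained loss function. 1: $Y \leftarrow \mathrm{standardize}(Y)$. 2: $K \leftarrow \mathrm{round}(\gamma \cdot T)$. 3: $P \leftarrow \mathrm{SVD}(Y; K)$. 4: $Z \leftarrow YP$, $\hat{Z} \leftarrow \hat{Y}P$. 5: $\mathcal{L}_{\mathrm{ortho},\gamma} \leftarrow \|\hat{Z} - Z\|_1$. 6: $\mathcal{L}_{\mathrm{MSE}} \leftarrow \|\hat{Y} - Y\|_2^2$. 7: $\mathcal{L}_{\alpha,\gamma} := \alpha \cdot \mathcal{L}_{\mathrm{ortho},\gamma} + (1 - \alpha) \cdot \mathcal{L}_{\mathrm{MSE}}$.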
Open Source Code	Yes	Code is available at https://github.com/Master-PLC/Time-o1.
Open Datasets	Yes	Datasets. In this work, we conduct experiments on ETT (4 subsets), ECL, Traffic, Weather, and PEMS [28] for long-term forecasting task, and M4 for short-term forecasting task [58]. All datasets are split chronologically into training, validation, and testing sets following their official settings.
Dataset Splits	Yes	All datasets are split chronologically into training, validation, and testing sets following their official settings.
Hardware Specification	Yes	Experiments are conducted on Intel(R) Xeon(R) Platinum 8383C CPUs and NVIDIA RTX H100 GPUs.
Software Dependencies	No	The paper mentions 'Adam [14] optimizer' and references scripts from other papers, but does not provide specific version numbers for general software dependencies such as Python, PyTorch, TensorFlow, or other libraries used in the implementation of Time-o1.
Experiment Setup	Yes	The baseline models are reproduced using the scripts provided by Fredformer [37]. All baseline models are trained using the Adam [14] optimizer to minimize $\mathcal{L}_{\mathrm{MSE}}$ in (1). Following the prestigious benchmark [38], the dropping-last trick is disabled during the test phase. When integrating Time-o1 to enhance an established model, we adhere to the associated hyperparameter settings in the public benchmark [37, 28], only tuning α, γ and learning rate conservatively. Experiments are conducted on Intel(R) Xeon(R) Platinum 8383C CPUs and NVIDIA RTX H100 GPUs.
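
The seven steps of Algorithm 1, quoted in the Pseudocode row above, can be sketched in NumPy as follows. This is an illustrative reading of the pseudocode, not the authors' implementation (see the linked repository for that). Several details are assumptions not stated in the quoted text: inputs are 2-D arrays of shape (n_samples, T); standardization is a column-wise z-score using label statistics, applied to both labels and forecasts; the projection P is taken as the top-K right singular vectors; and both norms are mean-reduced rather than summed. The function names `label_projection` and `time_o1_loss` are hypothetical.

```python
import numpy as np


def label_projection(y, gamma):
    """Return (mean, std, P) computed from labels y of shape (n_samples, T)."""
    mean = y.mean(axis=0, keepdims=True)
    std = y.std(axis=0, keepdims=True) + 1e-8       # small constant avoids division by zero
    y_std = (y - mean) / std                        # step 1: standardize labels (assumed z-score)
    k = max(1, int(round(gamma * y.shape[1])))      # step 2: K = round(gamma * T), at least 1
    _, _, vt = np.linalg.svd(y_std, full_matrices=False)  # step 3: SVD of standardized labels
    p = vt[:k].T                                    # top-K right singular vectors, shape (T, K)
    return mean, std, p


def time_o1_loss(y_hat, y, alpha=0.5, gamma=0.5):
    """Weighted sum of a transformed-domain L1 loss and a plain MSE loss."""
    mean, std, p = label_projection(y, gamma)
    y_s = (y - mean) / std
    y_hat_s = (y_hat - mean) / std                  # assumption: forecasts share the label scaling
    z, z_hat = y_s @ p, y_hat_s @ p                 # step 4: project onto retained components
    loss_ortho = np.abs(z_hat - z).mean()           # step 5: L1 term (mean-reduced, assumed)
    loss_mse = ((y_hat_s - y_s) ** 2).mean()        # step 6: MSE term (mean-reduced, assumed)
    return alpha * loss_ortho + (1 - alpha) * loss_mse  # step 7: convex combination


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    labels = rng.normal(size=(64, 12))
    forecasts = labels + 0.1 * rng.normal(size=(64, 12))
    print(time_o1_loss(forecasts, labels, alpha=0.5, gamma=0.5))
```

Setting `alpha=0` reduces the sketch to plain MSE on the standardized values, and `gamma` controls how many components are kept. To use it as a training loss, the same steps would need to be written in an automatic-differentiation framework.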