Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Large Pre-trained time series models for cross-domain Time series analysis tasks
Authors: Harshavardhan Prabhakar Kamarthi, B. Aditya Prakash
NeurIPS 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate LPTM on downstream forecasting and classification tasks from multiple domains and observe that LPTM consistently provides performance similar to or better than previous state-of-art models usually under zero-shot evaluation as well as when fine-tuned with lesser training data and compute time. Overall, we also observe that LPTM typically requires less than 80% of training data used by state-of-art baselines to provide similar or better performance. |
| Researcher Affiliation | Academia | Harshavardhan Kamarthi College of Computing Georgia Institute of Technology EMAIL B. Aditya Prakash College of Computing Georgia Institute of Technology EMAIL |
| Pseudocode | Yes | Algorithm 1: Adaptive Segmentation Module |
| Open Source Code | Yes | The code for implementation of LPTM and datasets are provided at anonymized link and hyperparameters are discussed in the Appendix. |
| Open Datasets | Yes | Epidemics: We use a large number of epidemic time-series aggregated by Project Tycho (van Panhuis et al., 2018)... Electricity: We use ETT electricity datasets (ETT1 and ETT2) collected from (Zhou et al., 2021)... Traffic Datasets: We use 2 datasets related to traffic speed prediction. PEMS-Bays (PEM-B) and METR-LA (Li et al., 2017)... M4 competition time-series: We also used the 3003 time-series of M4 forecasting competition (Makridakis and Hibon, 2000)... Motion and behavioral sensor datasets: We use the set of sensor datasets extracted from UEA archive (Bagnall et al., 2018) and UCI Machine learning repository (Asuncion and Newman, 2007). |
| Dataset Splits | Yes | We use the default 12/4/4 train/val/test split and use the train split for pre-training as well. We use an 80-20 train-test split similar to Chowdhury et al. (2022). |
| Hardware Specification | Yes | The model is run on Intel Xeon CPU with 64 cores and 128 GB RAM. We use a single A100 GPU with 80GB memory. |
| Software Dependencies | No | The paper mentions software components like GRU, Transformer, and Adam optimizer but does not provide specific version numbers for these or other software dependencies like Python, PyTorch, or CUDA. |
| Experiment Setup | Yes | For GRU we use a single hidden layer of 50 hidden units. Dimension of v is also 50. The transformer architecture consists of 10 layers with 8 attention heads each. For both pre-training and fine-tuning, we used the Adam optimizer with a learning rate of 0.001. For RANDMASK, we found the optimal γ = 0.4, and for LASTMASK γ = 0.2 was optimal. |
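
For readers who want to carry the reported settings into their own scripts, the values quoted in the Dataset Splits and Experiment Setup rows can be collected in a small configuration object. This is a minimal sketch, not the authors' code: the class name `LPTMSetup`, its field names, and the helper `split_sizes` are hypothetical, and because the quoted text does not state the unit of the 12/4/4 split, the helper treats the numbers purely as proportions.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class LPTMSetup:
    """Values quoted in the Experiment Setup row; all field names are hypothetical."""
    gru_hidden_layers: int = 1      # "a single hidden layer"
    gru_hidden_units: int = 50      # "50 hidden units"
    v_dim: int = 50                 # "Dimension of v is also 50"
    transformer_layers: int = 10    # "10 layers"
    attention_heads: int = 8        # "8 attention heads each"
    optimizer: str = "Adam"         # used for pre-training and fine-tuning
    learning_rate: float = 0.001
    randmask_gamma: float = 0.4     # reported optimum for RANDMASK
    lastmask_gamma: float = 0.2     # reported optimum for LASTMASK


def split_sizes(n_items: int, ratios: Sequence[int]) -> List[int]:
    """Divide n_items into consecutive parts proportional to ratios.

    Any rounding remainder goes to the first part; that choice is an
    assumption of this sketch, not something stated in the source.
    """
    total = sum(ratios)
    sizes = [n_items * r // total for r in ratios]
    sizes[0] += n_items - sum(sizes)
    return sizes


setup = LPTMSetup()
train_val_test = split_sizes(20, (12, 4, 4))   # proportions from the 12/4/4 split
train_test = split_sizes(100, (80, 20))        # proportions from the 80-20 split
```

The frozen dataclass keeps the quoted values immutable, so a run cannot silently drift from the settings the table reports.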