Conformalized Time Series with Semantic Features

Authors: Baiting Chen, Zhimei Ren, Lu Cheng

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on synthetic and benchmark datasets demonstrate that CT-SSF significantly outperforms existing state-of-the-art (SOTA) conformal prediction techniques in terms of prediction efficiency while maintaining a valid coverage guarantee.
Researcher Affiliation | Academia | Baiting Chen, Department of Statistics and Data Science, UCLA (brantchen@g.ucla.edu); Zhimei Ren, Department of Statistics and Data Science, University of Pennsylvania (zren@wharton.upenn.edu); Lu Cheng, Department of Computer Science, University of Illinois Chicago (lucheng@uic.edu)
Pseudocode | Yes | Algorithm 1: Conformalized Time Series with Semantic Features
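Algorithm 1 itself is not reproduced in this report. For context, the generic split-conformal backbone that such procedures build on can be sketched as below. This is a textbook sketch, not the paper's CT-SSF algorithm; the function name and the absolute-residual score are illustrative assumptions.

```python
import numpy as np

def conformal_interval(cal_scores, pred, alpha=0.1):
    """Generic split-conformal interval (not CT-SSF itself).

    cal_scores: nonconformity scores on the calibration set,
                e.g. absolute residuals |Y_i - f(X_i)|.
    pred:       point prediction for a new input.
    Returns an interval with >= 1 - alpha marginal coverage,
    using the ceil((n+1)(1-alpha))/n empirical quantile.
    """
    n = len(cal_scores)
    level = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)
    qhat = np.quantile(cal_scores, level, method="higher")
    return pred - qhat, pred + qhat
```

The `method="higher"` quantile and the `(n+1)`-inflated level are the standard finite-sample correction that makes the coverage guarantee hold exactly rather than only asymptotically.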
Open Source Code | Yes | Code and data are available at https://github.com/baiting0522/CT-SSF.
Open Datasets | Yes | The experiments encompass evaluations on synthetic data and four real-world benchmark datasets, allowing for assessment under both controlled and natural conditions. The real-world datasets are electricity, Amazon stock, weather, and wind data. The details of these datasets can be found in Appendix B. [18], [59]
Dataset Splits | Yes | Randomly split the dataset D into a training set D_tr = {(X_i, Y_i)}_{i ∈ I_tr} and a calibration set D_ca = {(X_i, Y_i)}_{i ∈ I_ca}. During calibration, to find the best value for the number of steps M, a subset (e.g., one-fifth) of the calibration set is held out as an additional validation set.
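The splitting procedure quoted above can be sketched as follows. This is a minimal sketch assuming a uniform random split; the function name, the 50/50 train/calibration ratio, and the exact index bookkeeping are illustrative assumptions, not taken from the released code.

```python
import numpy as np

def split_indices(n, cal_frac=0.5, val_frac=0.2, seed=0):
    """Randomly split n sample indices into training, calibration,
    and a validation subset carved out of the calibration set
    (the paper suggests e.g. one-fifth of the calibration set)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n)
    n_cal = int(n * cal_frac)
    cal_idx = idx[:n_cal]
    train_idx = idx[n_cal:]
    n_val = int(len(cal_idx) * val_frac)
    val_idx = cal_idx[:n_val]   # held out to tune the number of steps M
    cal_idx = cal_idx[n_val:]   # used for the nonconformity scores
    return train_idx, cal_idx, val_idx
```

With `n=100` and the defaults this yields 50 training, 40 calibration, and 10 validation indices, matching the "one-fifth of the calibration set" example.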
Hardware Specification | No | The paper states that computing resources are discussed in Appendix B, but neither Appendix B nor the main text provides specific hardware details such as GPU models, CPU types, or memory specifications used for the experiments.
Software Dependencies | No | The paper mentions deep learning models such as RNN and Transformer, but does not specify software dependencies with version numbers (e.g., Python, PyTorch, or TensorFlow versions).
Experiment Setup | Yes | During calibration, to find the best value for the number of steps M, a subset (e.g., one-fifth) of the calibration set is held out as an additional validation set. The nonconformity score is computed on the rest of the calibration set for various values of M, and the validation set is then used to select the M whose coverage is right above 1 − α. The base model is an 8-layer RNN.
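The M-selection rule described above (choose the M whose validation coverage is the smallest value at or above 1 − α) can be sketched as follows. The function name and the fallback for the case where no candidate reaches valid coverage are illustrative assumptions, not from the paper.

```python
def select_M(coverage_by_M, alpha=0.1):
    """Select the number of steps M per the rule in the paper:
    among candidates whose empirical validation coverage is at
    least 1 - alpha, pick the one whose coverage is right above
    the target.

    coverage_by_M: dict mapping candidate M -> empirical coverage
                   on the validation set.
    """
    target = 1 - alpha
    valid = {M: c for M, c in coverage_by_M.items() if c >= target}
    if not valid:
        # Fallback (assumption): no candidate achieves valid
        # coverage, so return the one with the highest coverage.
        return max(coverage_by_M, key=coverage_by_M.get)
    # Among valid candidates, take the smallest coverage, i.e.
    # the one "right above" 1 - alpha.
    return min(valid, key=valid.get)
```

For example, with candidate coverages {M=1: 0.85, M=2: 0.91, M=3: 0.95} and α = 0.1, the rule selects M = 2, since 0.91 is the smallest coverage at or above 0.90.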