Conformalized Time Series with Semantic Features

Authors: Baiting Chen, Zhimei Ren, Lu Cheng

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on synthetic and benchmark datasets demonstrate that CT-SSF significantly outperforms existing state-of-the-art (SOTA) conformal prediction techniques in terms of prediction efficiency while maintaining a valid coverage guarantee.
Researcher Affiliation | Academia | Baiting Chen, Department of Statistics and Data Science, UCLA (brantchen@g.ucla.edu); Zhimei Ren, Department of Statistics and Data Science, University of Pennsylvania (zren@wharton.upenn.edu); Lu Cheng, Department of Computer Science, University of Illinois Chicago (lucheng@uic.edu)
Pseudocode | Yes | Algorithm 1: Conformalized Time Series with Semantic Features
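Algorithm 1 itself is not reproduced in this report. For context, the generic split-conformal backbone that such procedures build on can be sketched as below. This is a textbook sketch, not the paper's CT-SSF algorithm; the function name and the absolute-residual score are illustrative assumptions.

```python
import numpy as np

def conformal_interval(cal_scores, pred, alpha=0.1):
    """Generic split-conformal interval (not CT-SSF itself).

    cal_scores: nonconformity scores on the calibration set,
                e.g. absolute residuals |Y_i - f(X_i)|.
    pred:       point prediction for a new input.
    Returns an interval with >= 1 - alpha marginal coverage,
    using the ceil((n+1)(1-alpha))/n empirical quantile.
    """
    n = len(cal_scores)
    level = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)
    qhat = np.quantile(cal_scores, level, method="higher")
    return pred - qhat, pred + qhat
```

The `method="higher"` quantile and the `(n+1)`-inflated level are the standard finite-sample correction that makes the coverage guarantee hold exactly rather than only asymptotically.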
Open Source Code | Yes | Code and data are available at https://github.com/baiting0522/CT-SSF.
Open Datasets | Yes | The experiments encompass evaluations on synthetic data and four real-world benchmark datasets, allowing for assessment under both controlled and natural conditions. The real-world datasets are electricity, Amazon stock, weather, and wind data. The details of these datasets can be found in Appendix B. [18], [59]
Dataset Splits | Yes | Randomly split the dataset D into a training set D_tr = {(X_i, Y_i)}_{i ∈ I_tr} and a calibration set D_ca = {(X_i, Y_i)}_{i ∈ I_ca}. During calibration, to find the best value for the number of steps M, a subset (e.g., one-fifth) of the calibration set is held out as an additional validation set.
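The splitting procedure quoted above can be sketched as follows. This is a minimal sketch assuming a uniform random split; the function name, the 50/50 train/calibration ratio, and the exact index bookkeeping are illustrative assumptions, not taken from the released code.

```python
import numpy as np

def split_indices(n, cal_frac=0.5, val_frac=0.2, seed=0):
    """Randomly split n sample indices into training, calibration,
    and a validation subset carved out of the calibration set
    (the paper suggests e.g. one-fifth of the calibration set)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n)
    n_cal = int(n * cal_frac)
    cal_idx = idx[:n_cal]
    train_idx = idx[n_cal:]
    n_val = int(len(cal_idx) * val_frac)
    val_idx = cal_idx[:n_val]   # held out to tune the number of steps M
    cal_idx = cal_idx[n_val:]   # used for the nonconformity scores
    return train_idx, cal_idx, val_idx
```

With `n=100` and the defaults this yields 50 training, 40 calibration, and 10 validation indices, matching the "one-fifth of the calibration set" example.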
Hardware Specification | No | The paper states that computing resources are discussed in Appendix B, but neither Appendix B nor the main text provides specific hardware details such as GPU models, CPU types, or memory specifications used for the experiments.
Software Dependencies | No | The paper mentions deep learning models such as RNN and Transformer, but does not specify software dependencies with version numbers (e.g., Python, PyTorch, or TensorFlow versions).
Experiment Setup | Yes | During calibration, to find the best value for the number of steps M, a subset (e.g., one-fifth) of the calibration set is held out as an additional validation set. The nonconformity score is computed on the rest of the calibration set for various values of M, and the validation set is then used to select the M whose coverage is right above 1 − α. The base model is an 8-layer RNN.
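The M-selection rule described above (choose the M whose validation coverage is the smallest value at or above 1 − α) can be sketched as follows. The function name and the fallback for the case where no candidate reaches valid coverage are illustrative assumptions, not from the paper.

```python
def select_M(coverage_by_M, alpha=0.1):
    """Select the number of steps M per the rule in the paper:
    among candidates whose empirical validation coverage is at
    least 1 - alpha, pick the one whose coverage is right above
    the target.

    coverage_by_M: dict mapping candidate M -> empirical coverage
                   on the validation set.
    """
    target = 1 - alpha
    valid = {M: c for M, c in coverage_by_M.items() if c >= target}
    if not valid:
        # Fallback (assumption): no candidate achieves valid
        # coverage, so return the one with the highest coverage.
        return max(coverage_by_M, key=coverage_by_M.get)
    # Among valid candidates, take the smallest coverage, i.e.
    # the one "right above" 1 - alpha.
    return min(valid, key=valid.get)
```

For example, with candidate coverages {M=1: 0.85, M=2: 0.91, M=3: 0.95} and α = 0.1, the rule selects M = 2, since 0.91 is the smallest coverage at or above 0.90.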