Improved Online Conformal Prediction via Strongly Adaptive Online Learning

Authors: Aadyot Bhatnagar, Huan Wang, Caiming Xiong, Yu Bai

ICML 2023

Reproducibility Variable (Result): LLM Response
Research Type (Experimental): Experiments show that our methods consistently obtain better coverage and smaller prediction sets than existing methods on real-world tasks, such as time series forecasting and image classification under distribution shift.
Researcher Affiliation (Industry): Salesforce AI Research, Palo Alto, CA, USA.
Pseudocode (Yes): Algorithm 1 Strongly Adaptive Online Conformal Prediction (SAOCP), adapted from Jun et al. (2017), and Algorithm 2 Scale-Free Online Gradient Descent (SF-OGD), adapted from Orabona & Pál (2018).
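For intuition, the sketch below shows a scale-free gradient-descent update applied to a prediction-set radius, assuming a (1 − α)-pinball loss as the per-step objective. This is a minimal illustration of the idea underlying SF-OGD, not the paper's Algorithm 2; the function and variable names are ours.

```python
import numpy as np

def sf_ogd_radius_updates(scores, alpha=0.1, eta=1.0, s0=0.0):
    """Illustrative scale-free OGD update on a prediction-set radius s_t.

    `scores` are observed nonconformity scores r_t; the radius is updated by
    gradient descent on the (1 - alpha)-pinball loss, with the step size
    divided by the root of the cumulative squared gradient norm (scale-free).
    A sketch of the idea, not the paper's implementation.
    """
    s, grad_sq_sum, radii = s0, 0.0, []
    for r in scores:
        radii.append(s)
        # Subgradient of the (1 - alpha)-quantile (pinball) loss in s:
        # +alpha when the score is covered (r <= s), -(1 - alpha) otherwise,
        # so the radius shrinks after coverage and grows after a miss.
        grad = alpha if r <= s else -(1.0 - alpha)
        grad_sq_sum += grad ** 2
        s = s - eta * grad / np.sqrt(grad_sq_sum)
    return np.array(radii)
```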
Open Source Code (Yes): The code for our experiments can be found at https://github.com/salesforce/online_conformal.
Open Datasets (Yes): We evaluate on four datasets totaling 5111 time series: the hourly (414 time series), daily (4227 time series), and weekly (359 time series) subsets of the M4 Competition, a dataset of time series from many domains including industries, demographics, environment, finance, and transportation (Makridakis et al., 2018); and NN5, a dataset of 111 time series of daily banking data (Ben Taieb et al., 2012).
Dataset Splits (No): Each time series of length L is split into a training set of length L − 120, with 80% for training the base predictor and 20% for initializing the UQ methods, and a test set of length 120 to test the UQ methods. (No explicit validation set split is provided.)
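For concreteness, a minimal sketch of the split described above; the exact rounding at the 80/20 boundary is an assumption, and the helper name is ours.

```python
def split_series(series_length, test_len=120, train_frac=0.8):
    """Index ranges matching the description: the last 120 points form the
    test set; the remaining L - 120 points are split 80/20 between training
    the base predictor and initializing/calibrating the UQ methods."""
    train_end = series_length - test_len           # first L - 120 points
    base_end = int(train_frac * train_end)         # 80% for the base predictor
    return (0, base_end), (base_end, train_end), (train_end, series_length)

# Example: a series of length 1000 -> base [0, 704), calibration [704, 880), test [880, 1000)
base_idx, calib_idx, test_idx = split_series(1000)
```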
Hardware Specification (No): The paper does not specify the exact hardware (e.g., GPU models, CPU types) used for running the experiments.
Software Dependencies (Yes): We use their implementations in Merlion v2.0.0 (Bhatnagar et al., 2021).
Experiment Setup (Yes): Throughout this section we choose the target coverage level to be the standard 1 − α = 90%. To set the maximum radius for SF-OGD and SAOCP, we choose D/3 for each horizon h to be the largest h-step residual observed on the calibration split of the training data. For Tiny ImageNet, we use λ = 0.01 and kreg = 20; for ImageNet, we use λ = 0.01 and kreg = 10. FACI has 4 hyperparameters: the individual expert learning rates γ1, ..., γN; a target interval length k; the meta-algorithm learning rate η; and a smoothing parameter σ. We set k = 100 and follow Gibbs & Candès (2022) to set N = 8, σ = 1/(2k), γ = {0.001, 0.002, 0.004, 0.008, 0.016, 0.032, 0.064, 0.128}.
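The quoted setup can be collected into a small configuration sketch. The dictionary keys, helper function, and the assumption that per-horizon arrays of absolute calibration residuals are available are ours, not the repository's API.

```python
import numpy as np

# Target coverage 1 - alpha = 90%.
alpha = 0.1

# FACI hyperparameters as quoted: k = 100, sigma = 1/(2k), and the grid of
# expert learning rates gamma_1, ..., gamma_8 (N = 8 experts).
faci_config = {
    "k": 100,
    "sigma": 1.0 / (2 * 100),
    "gammas": [0.001, 0.002, 0.004, 0.008, 0.016, 0.032, 0.064, 0.128],
}

# Regularization parameters for the conformal score in the image
# classification experiments, as quoted above.
score_params = {
    "TinyImageNet": {"lambda": 0.01, "k_reg": 20},
    "ImageNet": {"lambda": 0.01, "k_reg": 10},
}

def max_radius_per_horizon(calib_residuals):
    """Maximum radius D for SF-OGD / SAOCP: choose D so that D / 3 equals the
    largest h-step absolute residual seen on the calibration split, one value
    of D per forecast horizon h (calib_residuals maps h -> residual array)."""
    return {h: 3.0 * float(np.max(np.abs(r))) for h, r in calib_residuals.items()}
```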