Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Exploring the Noise Robustness of Online Conformal Prediction
Authors: Huajun Xi, Kangdao Liu, Hao Zeng, Wenguang Sun, Hongxin Wei
NeurIPS 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To verify the effectiveness of the robust pinball loss, we conduct extensive experiments on CIFAR100 [19] and ImageNet [20] with synthetic uniform label noise. In particular, we integrate the proposed loss into ACI with constant [9] and dynamic learning rates [12], and strongly adaptive online conformal prediction [11]. Empirical results show that the robust pinball loss enhances the noise robustness of online conformal prediction by eliminating the coverage gap caused by the label noise. |
| Researcher Affiliation | Academia | Huajun Xi¹, Kangdao Liu¹˒², Hao Zeng¹, Wenguang Sun³, Hongxin Wei¹. ¹Department of Statistics and Data Science, Southern University of Science and Technology; ²Department of Computer and Information Science, University of Macau; ³Center for Data Science, Zhejiang University. Correspondence to: Hongxin Wei <EMAIL> |
| Pseudocode | Yes | Algorithm 1: Noise-Robust Strongly Adaptive Online Conformal Prediction (NR-SAOCP); Algorithm 2: Noise-Robust Scale-Free Online Gradient Descent (NR-SF-OGD) |
| Open Source Code | Yes | We include an example code in supplemental material. |
| Open Datasets | Yes | To verify the effectiveness of the robust pinball loss, we conduct extensive experiments on CIFAR100 [19] and ImageNet [20] with synthetic uniform label noise. |
| Dataset Splits | Yes | We use CIFAR-100 [19] and ImageNet [20] datasets with synthetic label noise... we train these models for 200 epochs... the coverage gap and the prediction set size are computed over the full test set. |
| Hardware Specification | No | On ImageNet, we use four pre-trained classifiers from TorchVision [30]: ResNet18, ResNet50 [31], DenseNet121 [32] and VGG16 [33]. On CIFAR-100, we train these models for 200 epochs using SGD with a momentum of 0.9, a weight decay of 0.0005, and a batch size of 128. |
| Software Dependencies | No | On ImageNet, we use four pre-trained classifiers from TorchVision [30]: ResNet18, ResNet50 [31], DenseNet121 [32] and VGG16 [33]. On CIFAR-100, we train these models for 200 epochs using SGD with a momentum of 0.9, a weight decay of 0.0005, and a batch size of 128. |
| Experiment Setup | Yes | The experiments include both constant $\eta = 0.05$ and dynamic learning rates $\eta_t = 1/t^{1/2+\varepsilon}$ with $\varepsilon = 0.1$, following prior work [12]. We use CIFAR-100 [19] and ImageNet [20] datasets with synthetic label noise. On CIFAR-100, we train these models for 200 epochs using SGD with a momentum of 0.9, a weight decay of 0.0005, and a batch size of 128. We set the initial learning rate as 0.1, and reduce it by a factor of 5 at 60, 120 and 160 epochs. |
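
The two schedules quoted in the Experiment Setup row can be made concrete with a minimal Python sketch. It is not the authors' code: the function names are hypothetical, the dynamic rate is read as η_t = 1/t^(1/2+ε) with rounds counted from t = 1, and the classifier learning rate is assumed to drop at the start of epochs 60, 120 and 160 under 0-based epoch counting.

```python
def constant_step(t: int, eta: float = 0.05) -> float:
    """Constant online learning rate (eta = 0.05 in the quoted setup)."""
    return eta


def dynamic_step(t: int, eps: float = 0.1) -> float:
    """Decaying online learning rate 1 / t^(1/2 + eps), for rounds t = 1, 2, ..."""
    if t < 1:
        raise ValueError("rounds are assumed to start at t = 1")
    return 1.0 / t ** (0.5 + eps)


def training_lr(epoch: int, base_lr: float = 0.1, factor: float = 5.0,
                milestones: tuple = (60, 120, 160)) -> float:
    """Step schedule for classifier training: divide base_lr by `factor`
    once for every milestone already reached (0-based epochs, an assumption)."""
    drops = sum(1 for m in milestones if epoch >= m)
    return base_lr / factor ** drops


if __name__ == "__main__":
    for t in (1, 10, 100, 1000):
        print(f"t={t:5d}  constant={constant_step(t):.4f}  dynamic={dynamic_step(t):.4f}")
    for epoch in (0, 59, 60, 120, 160, 199):
        print(f"epoch={epoch:3d}  lr={training_lr(epoch):.5f}")
```

With ε = 0.1 the dynamic rate starts at 1 and decays slightly faster than 1/√t, while the training schedule ends at 0.1 / 5³ = 0.0008 for the final 40 epochs.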