Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Distributed Conformal Prediction via Message Passing

Authors: Haifeng Wen, Hong Xing, Osvaldo Simeone

ICML 2025

Reproducibility Variable Result LLM Response
Research Type Experimental Through extensive experiments, we investigate the trade-offs between hyperparameter tuning requirements, communication overhead, coverage guarantees, and prediction set sizes across different network topologies. The code of our work is released on: https://github.com/HaifengWen/Distributed-Conformal-Prediction.
Researcher Affiliation Academia 1 IoT Thrust, The Hong Kong University of Science and Technology (Guangzhou), Guangzhou, China; 2 Department of ECE, The Hong Kong University of Science and Technology, HK SAR; 3 Department of Engineering, King's College London, London, U.K.
Pseudocode Yes The proposed Q-DCP is summarized in Algorithm 1. ... The proposed H-DCP is summarized in Algorithm 2.
Open Source Code Yes The code of our work is released on: https://github.com/HaifengWen/Distributed-Conformal-Prediction.
Open Datasets Yes As in Lu et al. (2023), we first train a shared model f(·) using the CIFAR-100 training data set to generate the score function s(·, ·). Calibration data, obtained from the CIFAR-100 test data, is distributed in a non-i.i.d. manner among K = 20 devices... PathMNIST includes 9 classes and 107,180 data samples in total (89,996 for training, 10,004 for validation, 7,180 for test).
Dataset Splits Yes PathMNIST includes 9 classes and 107,180 data samples in total (89,996 for training, 10,004 for validation, 7,180 for test).
Hardware Specification No No specific hardware details (like GPU/CPU models, processor types, or memory amounts) were mentioned in the paper.
Software Dependencies No The paper does not provide specific ancillary software details with version numbers.
Experiment Setup Yes The hyperparameters for the Q-DCP loss (8) are chosen as follows. We set κ = 2000 for the smooth function g(·) as suggested by Nkansah et al. (2021), and we choose µ = 2000. Moreover, unless noted otherwise, in (8), we set s0 to be the average of the local score quantiles... For H-DCP, unless noted otherwise, we set the consensus rate to η = 1, and the number of quantization levels to M = 1000. We set n_k = 50 for all devices k ∈ V.
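The reported experimental settings above can be collected into a small configuration sketch. This is not code from the authors' repository; the key names are illustrative, and only the numeric values come from the quoted text:

```python
# Reported settings from the paper's quoted experiment setup.
# Key names are illustrative (not from the paper); values are as quoted.
QDCP_HYPERPARAMS = {
    "kappa": 2000,  # smoothing parameter for g(.), per Nkansah et al. (2021)
    "mu": 2000,
    "n_k": 50,      # calibration samples per device, for all devices k in V
}
HDCP_HYPERPARAMS = {
    "eta": 1,       # consensus rate
    "M": 1000,      # number of quantization levels
}
K_DEVICES = 20      # non-i.i.d. calibration data spread over K = 20 devices

PATHMNIST_SPLITS = {"train": 89_996, "val": 10_004, "test": 7_180}

# Sanity check: the reported splits sum to the stated 107,180 total samples.
assert sum(PATHMNIST_SPLITS.values()) == 107_180
```

The assertion confirms that the train/validation/test counts quoted in the Dataset Splits row are internally consistent with the stated total.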