Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

On Volume Minimization in Conformal Regression

Authors: Batiste Le Bars, Pierre Humbert

ICML 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Finally, we evaluate the empirical performance and the robustness of our methodologies. In Section 5, a set of synthetic data experiments illustrates the empirical performance and the robustness of our approaches on asymmetric and heavy-tailed distributions.
Researcher Affiliation	Academia	(1) Univ. Lille, Inria, CNRS, Centrale Lille, UMR 9189, CRIStAL, F-59000 Lille; (2) Sorbonne Université et Univ. Paris Cité, CNRS, LPSM, F-75005 Paris, France.
Pseudocode	Yes	Algorithm 1: Gradient descent to solve the QAE problem (step 1 of EffOrt and Ad-EffOrt)
Open Source Code	Yes	Code to run all methods is available at https://github.com/pierreHmbt/AdEffOrt.
Open Datasets	Yes	We finally compare Ad-EffOrt with Locally Weighted CP (LW-CP) and CQR on the following public-domain real data sets also considered in e.g. (Romano et al., 2019): abalone (Nash et al., 1994), boston housing (housing) (Harrison Jr & Rubinfeld, 1978), and concrete compressive strength (concrete) (Yeh, 1998).
Dataset Splits	Yes	For each scenario, we generate n_lrn = n_cal = 1000 pairs (X_i, Y_i), as well as n_test = 1000 test points to compute the empirical marginal coverage and the average size of the returned set. We randomly split each data set 10 times into a training set, a calibration set and a test set of respective "size" 40%, 40%, and 20%.
Hardware Specification	No	No specific hardware details (GPU/CPU models, memory, etc.) were found in the paper regarding the execution of experiments. The paper focuses on software implementations and experimental setup parameters without specifying the underlying hardware.
Software Dependencies	No	The function ŝ(·) (second step of Ad-EffOrt) and the two quantile regression functions of CQR are learned by using a Random Forest (RF) quantile regressor, implemented in the Python package sklearn-quantile. The function σ̂ in LW-CP is learned using the RF regression implementation of scikit-learn (Pedregosa et al., 2011).
Experiment Setup	Yes	The smoothing parameter ε is set to 0.1, n_iter = 1000, and the step-size sequence is {(1/t)^0.6} for t = 1, ..., n_iter. [...] using a robust linear regression with Huber loss with parameter δ = 1.35. [...] we set α = 0.1. [...] the max-depth of the RF is set to 5.
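
The split protocol and step-size schedule quoted in the table can be illustrated with a minimal sketch. This is not the authors' code: the function names `split_indices` and `step_sizes`, the data set size of 500, and the use of seeds 0 to 9 are all assumptions made for illustration. Only the 40%/40%/20% proportions, the 10 repetitions, n_iter = 1000, and the exponent 0.6 come from the quoted excerpts.

```python
import random


def split_indices(n, seed, fractions=(0.4, 0.4, 0.2)):
    """Shuffle range(n) and cut it into train / calibration / test index lists.

    Hypothetical helper: the 40/40/20 proportions follow the quoted excerpt,
    everything else (seeding, rounding) is an illustrative assumption.
    """
    rng = random.Random(seed)
    idx = list(range(n))
    rng.shuffle(idx)
    n_train = round(fractions[0] * n)
    n_cal = round(fractions[1] * n)
    train = idx[:n_train]
    cal = idx[n_train:n_train + n_cal]
    test = idx[n_train + n_cal:]  # remaining points, about 20%
    return train, cal, test


def step_sizes(n_iter=1000, power=0.6):
    """Step-size sequence (1/t)^power for t = 1, ..., n_iter."""
    return [(1.0 / t) ** power for t in range(1, n_iter + 1)]


# Ten independent random splits of a hypothetical data set with 500 rows.
splits = [split_indices(n=500, seed=s) for s in range(10)]

# Decreasing step sizes for the 1000 gradient descent iterations.
etas = step_sizes()
```

Each element of `splits` is a `(train, cal, test)` triple of disjoint index lists, so coverage and interval size can be averaged over the ten repetitions as the excerpt describes.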