Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
RS-Reg: Probabilistic and Robust Certified Regression through Randomized Smoothing
Authors: Aref Miri Rekavandi, Olga Ohrimenko, Benjamin I. P. Rubinstein
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we perform experiments to empirically validate our theoretical results. For synthetic simulations, we present the results for an example function that demonstrates sharp variations in output. We then apply the proposed methods on a camera re-localization task (Rekavandi et al., 2023) based on images. All simulations and experiments were conducted using an Intel(R) Core(TM) i7-9750H CPU running at 2.60GHz (with a base clock speed of 2.59GHz) and 16GB of RAM. |
| Researcher Affiliation | Academia | Aref Miri Rekavandi (School of Computing and Information Systems, The University of Melbourne); Olga Ohrimenko (School of Computing and Information Systems, The University of Melbourne); Benjamin I. P. Rubinstein (School of Computing and Information Systems, The University of Melbourne) |
| Pseudocode | No | The paper describes theoretical proofs and algorithms through mathematical equations and textual descriptions, but does not contain any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code is publicly available at https://github.com/arekavandi/Certified_Robust_Regression. |
| Open Datasets | Yes | DSAC (Brachmann & Rother, 2022) is a popular technique and adopted in this paper for robustness analysis... We used a threshold of ϵ_y = 5 m for defining the accepted region in the output, with (U = 85 and L = 15), β = 2, and P = 80%. We investigate the range of r ∈ [0, 0.1] for the scene, where the image dimension was 480 × 854 pixels. |
| Dataset Splits | Yes | For the Great Court Scene, out of 760 test images, 120 randomly selected images were used (due to the similarity of the images and to reduce the required runtime) to report the certified error rate defined above. |
| Hardware Specification | Yes | All simulations and experiments were conducted using an Intel(R) Core(TM) i7-9750H CPU running at 2.60GHz (with a base clock speed of 2.59GHz) and 16GB of RAM. |
| Software Dependencies | No | The paper does not explicitly mention any specific software dependencies with version numbers. |
| Experiment Setup | Yes | We set σ = 0.23, ϵ_y = 6 for the ℓ1 output norm, U = 35, L = 0, τ = 0, n = 10K, to ensure that the user-defined probability P = 80% is always less than p_A. As n is large, we used the estimated p_A directly and skipped the use of the Clopper-Pearson lower bound estimator (see Appendix B). We also selected β ∈ {1.5, 2} for the discounted certification algorithm. ... For learning p_A using Clopper-Pearson (α = 0.5), we used 200 samples, and then we used n = 10 for each radius to examine models in the Cambridge Great Court scene of the Cambridge Landmarks dataset (Kendall et al., 2015) using the DSAC pre-trained model. ... We used a threshold of ϵ_y = 5 m for defining the accepted region in the output, with (U = 85 and L = 15), β = 2, and P = 80%. We investigate the range of r ∈ [0, 0.1] for the scene, where the image dimension was 480 × 854 pixels. |
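The Experiment Setup row quotes the paper's use of a Clopper-Pearson lower confidence bound on p_A (the probability that a smoothed output lands in the accepted region). As an illustrative sketch only — this is not the authors' code; the function name is ours, and we use a conventional one-sided α = 0.05 rather than the α = 0.5 quoted above — a stdlib-only Python implementation finds the bound by bisection on the binomial survival function:

```python
import math


def clopper_pearson_lower(k: int, n: int, alpha: float = 0.05) -> float:
    """One-sided Clopper-Pearson lower confidence bound for a binomial
    proportion: the largest p with P(X >= k | n, p) <= alpha.

    k: number of "successes" (e.g., noisy samples whose output fell in
       the accepted region); n: total Monte Carlo samples.
    Returns 0.0 when k == 0 (nothing observed, no nontrivial bound)."""
    if k == 0:
        return 0.0

    def survival(p: float) -> float:
        # P(X >= k) under Binomial(n, p); increasing in p.
        return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i)
                   for i in range(k, n + 1))

    lo, hi = 0.0, 1.0
    for _ in range(100):  # bisection; 100 halvings is ample precision
        mid = (lo + hi) / 2
        if survival(mid) < alpha:
            lo = mid
        else:
            hi = mid
    return lo


# Hypothetical example in the spirit of the quoted setup: 200 samples,
# of which 180 fall in the accepted region.
p_lower = clopper_pearson_lower(180, 200, alpha=0.05)
```

The bound is deliberately conservative: `p_lower` sits below the point estimate k/n, so certifying against it (rather than the raw estimate) keeps the probabilistic guarantee valid despite Monte Carlo sampling error. The paper notes it skips this step when n is large (n = 10K) and uses the estimate directly.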