Adaptive Hierarchical Certification for Segmentation using Randomized Smoothing

Authors: Alaa Anani, Tobias Lorenz, Bernt Schiele, Mario Fritz

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our extensive experiments on the datasets Cityscapes, PASCAL-Context, ACDC and COCO-Stuff demonstrate that our adaptive algorithm achieves a higher CIG and lower abstain rate compared to the current state-of-the-art certification method.
Researcher Affiliation | Academia | 1CISPA Helmholtz Center for Information Security, Saarbrücken, Germany 2Max Planck Institute for Informatics, Saarland Informatics Campus, Saarbrücken, Germany.
Pseudocode | Yes | These steps are outlined in Algorithm 1 describing GETCOMPONENTLEVELS... Algorithm 2 HSAMPLE... Algorithm 3 ADAPTIVECERTIFY... Algorithm 4 SAMPLEPOSTERIORS... Algorithm 5 HYPOTHESESTESTING.
Open Source Code | Yes | Our code can be found here: https://github.com/AlaaAnani/adaptive-certify.
Open Datasets | Yes | We use four segmentation datasets: Cityscapes (Cordts et al., 2016), the Adverse Conditions Dataset with Correspondences (ACDC) (Sakaridis et al., 2021), PASCAL-Context (Mottaghi et al., 2014) and COCO-Stuff-10K (Caesar et al., 2018).
Dataset Splits | Yes | The final model performance on the clean split has a mean per-pixel accuracy of 62.77% and an mIoU of 0.3146. Meanwhile, on the noisy validation split, the mean per-pixel accuracy is 53.43% and the mIoU is 0.2436. We validate twice every epoch on both the clean and noisy (σ = 0.25) validation splits.
Hardware Specification | No | The paper mentions training was done 'per GPU' but does not specify any particular GPU models, CPU models, memory amounts, or other specific hardware configurations used for the experiments.
Software Dependencies | No | The paper mentions 'PyTorch' and 'PaddleCls' as software used, but it does not provide specific version numbers for these or any other ancillary software components, which is required for reproducibility.
Experiment Setup | Yes | Unless stated otherwise, all certification results use the values σ = 0.25, τ = 0.75 and α = 0.001 in both our algorithm ADAPTIVECERTIFY and the baseline SEGCERTIFY. We use different parameters for the threshold function T_thresh (Eq. 4) for ADAPTIVECERTIFY per dataset as described in App. B.3, which we found via a grid search that maximizes the CIG metric. The batch sizes used for training and validation were 12 and 1 respectively, per GPU.
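To make the reported parameters concrete: in SegCertify-style randomized smoothing, each pixel's prediction is certified only if its top class wins a one-sided hypothesis test against the threshold τ at significance level α; otherwise the method abstains. The sketch below illustrates that per-pixel test with the section's values (τ = 0.75, α = 0.001) using an exact binomial tail. It is a minimal, self-contained illustration, not the paper's ADAPTIVECERTIFY or HYPOTHESESTESTING implementation; the function names and the dict-based class counts are assumptions for this example.

```python
import math

def binom_sf(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), via the exact tail sum."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def certify_pixel(class_counts, n, tau=0.75, alpha=0.001):
    """Return the certified class id for one pixel, or None to abstain.

    class_counts maps class id -> how often that class was the pixel's
    prediction across n noisy (Gaussian-perturbed) samples. We certify
    the top class only if, under the null hypothesis that its true
    probability is tau, observing a count this large has probability
    below alpha (i.e. the true probability significantly exceeds tau).
    """
    top_class = max(class_counts, key=class_counts.get)
    k = class_counts[top_class]
    if binom_sf(k, n, tau) < alpha:
        return top_class
    return None  # evidence too weak at level alpha: abstain
```

For example, with n = 100 samples, a top-class count of 95 comfortably rejects the null at τ = 0.75 and gets certified, while a count of 78 (barely above 75 = τ·n) does not, so the pixel abstains. The paper's adaptive variant differs in that, instead of abstaining outright, it relaxes the prediction to a coarser level of a label hierarchy.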