Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Exploring Geometry of Blind Spots in Vision Models

Authors: Sriram Balasubramanian, Gaurang Sriramanan, Vinu Sankar Sadasivan, Soheil Feizi

NeurIPS 2023 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In this paper, we primarily consider standard vision datasets such as ImageNet [Deng et al., 2009] and CIFAR-10 [Krizhevsky et al., 2009]... We present the image quality metrics for blindspots discovered by LST in Table 1... Here we perform ablation studies on LST hyperparameters to study the impact of each one independently.
Researcher Affiliation	Academia	Sriram Balasubramanian EMAIL Gaurang Sriramanan EMAIL Vinu Sankar Sadasivan EMAIL Soheil Feizi EMAIL Department of Computer Science University of Maryland, College Park
Pseudocode	Yes	Algorithm 1 Level Set Traversal (LST) 1: Input: Source image x_s with label y, target image x_t, model f, max iterations m, scale factor η, stepsize ϵ, confidence threshold δ 2: Initialize x = x_s, x∥ = 0 3: for i = 1 to m do 4: Δx = x_t − x 5: g = ∇ₓ CE(f(x), y) 6: c∥ = (g · Δx)/‖g‖² 7: Δx⊥ = η(Δx − c∥ g) 8: x∥ = Π∞(x∥ − ϵg, −ϵ, ϵ) 9: x_new = x + Δx⊥ + x∥ 10: if f(x_s)[j] − f(x_new)[j] > δ then 11: return x 12: x = x_new 13: return x
Open Source Code	Yes	The code for this project is publicly available at this URL.
Open Datasets	Yes	In this paper, we primarily consider standard vision datasets such as ImageNet [Deng et al., 2009] and CIFAR-10 [Krizhevsky et al., 2009] (latter in Section C of the Appendix).
Dataset Splits	No	In this paper, we present results on vision datasets such as ImageNet [Deng et al., 2009] and CIFAR10 [Krizhevsky et al., 2009], given that they have come to serve as benchmark datasets in the field... To calculate these metrics, we sample around 1000 source images from ImageNet, and select five other random target images of different classes for each source image.
Hardware Specification	Yes	We record wall clock time on a single RTX A5000 GPU with a ResNet-50 model on ImageNet, using a batchsize of 100, and report mean and standard deviation (µ ± σ) statistics over 5 independent minibatches.
Software Dependencies	No	In this paper, all training and experimental evaluations were performed using Pytorch [Paszke et al., 2019].
Experiment Setup	Yes	ImageNet: In the main paper, we fix the parameters of the LST algorithm for all the visualizations (Fig 3,6,7 and Tables 1,2 in the Main paper). The scale factor for the step perpendicular to the gradient, or η, is 10^-2. The stepsize for the perturbation parallel to the gradient ∇ₓ CE(f(x), y), or ϵ, is 2e-3. The confidence threshold (δ) is 0.2, which means that the confidence never drops below the confidence of the source image by more than 0.2. In practice, we rarely observe such significant drops in the confidence during the level set traversal. The algorithm is run for m = 400 iterations.
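
The Level Set Traversal steps quoted in the Pseudocode row, with the hyperparameters from the Experiment Setup row as defaults, can be sketched roughly as follows. This is a minimal illustration and not the authors' released code. It makes several assumptions: a toy linear-softmax classifier (`predict`, `ce_grad`) stands in for the vision model f, images are plain Python lists, g is taken to be the gradient of the cross-entropy loss with respect to x, and the index j in the confidence check is taken to be the source label y. All function names are hypothetical.

```python
import math

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def predict(W, b, x):
    """Toy linear-softmax classifier standing in for the model f (assumption)."""
    logits = [sum(w * xi for w, xi in zip(row, x)) + bk for row, bk in zip(W, b)]
    return softmax(logits)

def ce_grad(W, b, x, y):
    """Gradient of CE(f(x), y) w.r.t. x for the toy model: W^T (p - onehot(y))."""
    p = predict(W, b, x)
    err = [pk - (1.0 if k == y else 0.0) for k, pk in enumerate(p)]
    return [sum(err[k] * W[k][i] for k in range(len(W))) for i in range(len(x))]

def level_set_traversal(W, b, xs, y, xt, m=400, eta=1e-2, eps=2e-3, delta=0.2):
    """Move from source xs toward target xt while keeping confidence on y high."""
    x = list(xs)
    x_par = [0.0] * len(xs)           # accumulated perturbation along the gradient
    conf_src = predict(W, b, xs)[y]   # confidence of the source image
    for _ in range(m):
        dx = [t - v for t, v in zip(xt, x)]
        g = ce_grad(W, b, x, y)
        g_sq = sum(v * v for v in g)
        # Component of dx along g (guarded against a vanishing gradient).
        c_par = sum(gi * di for gi, di in zip(g, dx)) / g_sq if g_sq > 0 else 0.0
        # Scaled step toward the target, restricted to directions orthogonal to g.
        dx_perp = [eta * (di - c_par * gi) for di, gi in zip(dx, g)]
        # Descent step on the loss, clipped elementwise to [-eps, eps].
        x_par = [max(-eps, min(eps, pi - eps * gi)) for pi, gi in zip(x_par, g)]
        x_new = [v + dp + pp for v, dp, pp in zip(x, dx_perp, x_par)]
        # Stop if confidence falls more than delta below the source confidence.
        if conf_src - predict(W, b, x_new)[y] > delta:
            return x
        x = x_new
    return x

if __name__ == "__main__":
    # Two classes separated along the first coordinate (made-up toy data).
    W = [[1.0, 0.0, 0.0], [-1.0, 0.0, 0.0]]
    b = [0.0, 0.0]
    source, target = [2.0, 0.0, 0.0], [-2.0, 3.0, 1.0]
    out = level_set_traversal(W, b, source, 0, target)
    print("final point:", out)
    print("source confidence:", predict(W, b, source)[0])
    print("final confidence:", predict(W, b, out)[0])
```

In this toy setting the gradient only has a component along the first coordinate, so the traversal moves toward the target in the remaining coordinates while the confidence on the source class stays within δ of its starting value. With a real network, `predict` and `ce_grad` would be replaced by a forward pass and automatic differentiation.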