Foundation Model-oriented Robustness: Robust Image Model Evaluation with Pretrained Models

Authors: Peiyan Zhang, Haoyang Liu, Chaozhuo Li, Xing Xie, Sunghun Kim, Haohan Wang

ICLR 2024

Reproducibility

| Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "We consider four different scenarios, ranging from the basic benchmark MNIST (LeCun et al., 1998), through CIFAR10 (Krizhevsky et al., 2009), 9-class ImageNet (Santurkar et al., 2019), to full-fledged 1000-class ImageNet (Deng et al., 2009)." |
| Researcher Affiliation | Collaboration | Peiyan Zhang¹, Haoyang Liu², Chaozhuo Li³, Xing Xie³, Sunghun Kim¹, and Haohan Wang² — ¹Hong Kong University of Science and Technology; ²University of Illinois at Urbana-Champaign; ³Microsoft Research Asia |
| Pseudocode | Yes | Algorithm 1: Perturbed Image Generation with Foundation Models |
| Open Source Code | No | The paper provides links to pretrained models and external libraries used (e.g., in Section M), but it does not explicitly state that the source code for its own proposed methodology is open-source, nor does it provide a link to it. |
| Open Datasets | Yes | "We consider four different scenarios, ranging from the basic benchmark MNIST (LeCun et al., 1998), through CIFAR10 (Krizhevsky et al., 2009), 9-class ImageNet (Santurkar et al., 2019), to full-fledged 1000-class ImageNet (Deng et al., 2009)." |
| Dataset Splits | No | The paper mentions "validation" in the context of the Validation Rate (VR) metric for perturbed images and the role of an "ensemble of multiple foundation models to validate the correctness of labels", but it does not give explicit train/validation/test split details (e.g., percentages or sample counts) needed for reproducibility. |
| Hardware Specification | Yes | "Our model evaluations are done on 8 NVIDIA V100 GPUs. With our Sparsified VQGAN model, our method is also feasible to work with a small amount of GPU resources. As shown in Appendix I, the proposed protocol can work on a single NVIDIA V100 GPU efficiently." |
| Software Dependencies | No | The paper mentions several software components and models, including VQGAN (Esser et al., 2021), CLIP (Radford et al., 2021), the timm library (Wightman, 2019), LASSO, and SAGA (Defazio et al., 2014) as a solver. However, it does not provide specific version numbers for these dependencies, which would be necessary for full reproducibility. |
| Experiment Setup | Yes | "The perturbation step size for each iteration is 0.001. The total number of iterations allowed (computation budget B) is 50. For ImageNet, we resize all images to 224 × 224 px. We also center and re-scale the color values with µRGB = [0.485, 0.456, 0.406] and σRGB = [0.229, 0.224, 0.225]." |
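The normalization quoted in the Experiment Setup row is the standard ImageNet preprocessing. A minimal sketch of that step, assuming a uint8 H×W×3 RGB array as input (the `preprocess` function name is illustrative, not from the paper; resizing to 224 × 224 px would precede this step):

```python
import numpy as np

# ImageNet channel statistics quoted in the paper's setup.
MEAN_RGB = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD_RGB = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def preprocess(image: np.ndarray) -> np.ndarray:
    """Rescale a uint8 HxWx3 RGB image to [0, 1], then center and
    re-scale each color channel with the means and standard
    deviations above (broadcast over the last axis)."""
    x = image.astype(np.float32) / 255.0
    return (x - MEAN_RGB) / STD_RGB
```

In practice the same transform is typically applied via a library helper (e.g., a torchvision `Normalize` transform) rather than hand-rolled NumPy; the sketch only makes the arithmetic explicit.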