Does Progress On Object Recognition Benchmarks Improve Generalization on Crowdsourced, Global Data?

Authors: Megan Richards, Polina Kirichenko, Diane Bouchacourt, Mark Ibrahim

ICLR 2024

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "We perform a comprehensive empirical study on two crowdsourced, globally representative datasets, evaluating nearly 100 vision models to uncover several concerning empirical trends." |
| Researcher Affiliation | Collaboration | Meta AI, New York University |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | "To support these efforts, we will release our model test bed and evaluation code in a ready-to-use package, allowing researchers to run their own evaluations with just 4 lines of code." |
| Open Datasets | Yes | "ImageNet (Russakovsky et al., 2015), the standard benchmark for object recognition, has set the bar for progress in computer vision. ... ImageNet-A, -C, and -R (Hendrycks et al., 2021b; Hendrycks and Dietterich, 2019; Hendrycks et al., 2021a) ... Recently, two datasets of household objects spanning the globe were introduced: DollarStreet (Rojas et al.) and GeoDE (Ramaswamy et al., 2023)." |
| Dataset Splits | No | The paper mentions a "training split" and "evaluation set" for the last-layer retraining experiment, but does not provide explicit training/validation/test splits for its broader analysis of nearly 100 vision models, nor a distinct validation set for the retraining experiment. |
| Hardware Specification | No | The paper does not specify the hardware (e.g., GPU/CPU models, memory) used to run its experiments. |
| Software Dependencies | No | "We primarily use weights available in the timm library (Wightman, 2019) for ImageNet-trained models, use the OpenCLIP library for CLIP models (Ilharco et al., 2021), and use Hugging Face (Wolf et al., 2020) implementations of other foundation models. For data augmentation, for all models we used the ImageNet normalization available in PyTorch, resize images to 256 pixels, and center crop to 224 pixels." |
| Experiment Setup | Yes | "We train the last layer for 5 epochs using the Adam optimizer, learning rate 10^-5, and batch size 32." |
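The preprocessing and last-layer retraining setup quoted above can be sketched in PyTorch. This is a minimal illustration, not the authors' released code: the backbone/head split, the `retrain_last_layer` helper, and the dummy loader are assumptions for the sketch; the normalization constants are the standard ImageNet mean/std values, and the optimizer settings (Adam, lr 1e-5, batch size 32, 5 epochs) follow the reported setup.

```python
import torch
import torch.nn as nn

# Standard ImageNet channel statistics, as used by PyTorch/torchvision.
IMAGENET_MEAN = torch.tensor([0.485, 0.456, 0.406]).view(3, 1, 1)
IMAGENET_STD = torch.tensor([0.229, 0.224, 0.225]).view(3, 1, 1)

def imagenet_normalize(x: torch.Tensor) -> torch.Tensor:
    """Normalize a (3, H, W) image tensor with ImageNet statistics.

    The paper also resizes to 256 px and center-crops to 224 px before
    normalization; that step is omitted here to keep the sketch torch-only.
    """
    return (x - IMAGENET_MEAN) / IMAGENET_STD

def retrain_last_layer(backbone: nn.Module, head: nn.Module, loader,
                       epochs: int = 5, lr: float = 1e-5) -> nn.Module:
    """Freeze the backbone and retrain only the classification head,
    mirroring the reported setup (Adam, lr 1e-5, 5 epochs)."""
    for p in backbone.parameters():
        p.requires_grad = False
    backbone.eval()
    head.train()
    optimizer = torch.optim.Adam(head.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for images, labels in loader:  # loader yields batches, e.g. of size 32
            with torch.no_grad():
                feats = backbone(images)  # frozen features
            loss = loss_fn(head(feats), labels)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return head
```

In this sketch only the head's parameters are handed to the optimizer and the backbone runs under `torch.no_grad()`, so backbone weights are guaranteed to stay fixed while the new last layer adapts to the target labels.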