Does Progress On Object Recognition Benchmarks Improve Generalization on Crowdsourced, Global Data?
Authors: Megan Richards, Polina Kirichenko, Diane Bouchacourt, Mark Ibrahim
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We perform a comprehensive empirical study on two crowdsourced, globally representative datasets, evaluating nearly 100 vision models to uncover several concerning empirical trends |
| Researcher Affiliation | Collaboration | Meta AI, New York University |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | To support these efforts, we will release our model test bed and evaluation code in a ready-to-use package, allowing researchers to run their own evaluations with just 4 lines of code. |
| Open Datasets | Yes | ImageNet (Russakovsky et al., 2015), the standard benchmark for object recognition, has set the bar for progress in computer vision. ... ImageNet-A, -C, and -R (Hendrycks et al., 2021b; Hendrycks and Dietterich, 2019; Hendrycks et al., 2021a)... Recently, two datasets of household objects spanning the globe were introduced: Dollar Street (Rojas et al.) and GeoDE (Ramaswamy et al., 2023). |
| Dataset Splits | No | The paper mentions a 'training split' and 'evaluation set' for the last-layer retraining experiment, but does not provide explicit train/validation/test splits for its broader analysis of nearly 100 vision models, nor a distinct validation set for the retraining experiment. |
| Hardware Specification | No | The paper does not specify the hardware (e.g., GPU/CPU models, memory) used for running its experiments. |
| Software Dependencies | No | We primarily use weights available in the timm library (Wightman, 2019) for ImageNet-trained models, use the OpenCLIP library for CLIP models (Ilharco et al., 2021), and use Hugging Face (Wolf et al., 2020) implementations of other foundation models. For data augmentation, for all models we used the ImageNet normalization available in PyTorch, resize images to 256 pixels, and center crop to 224 pixels. |
| Experiment Setup | Yes | We train the last layer for 5 epochs using the Adam optimizer, learning rate 10⁻⁵, and batch size 32. |