ImageNet-X: Understanding Model Mistakes with Factor of Variation Annotations
Authors: Badr Youbi Idrissi, Diane Bouchacourt, Randall Balestriero, Ivan Evtimov, Caner Hazirbas, Nicolas Ballas, Pascal Vincent, Michal Drozdzal, David Lopez-Paz, Mark Ibrahim
ICLR 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Equipped with ImageNet-X, we investigate 2,200 current recognition models and study the types of mistakes as a function of a model's (1) architecture (e.g., transformer vs. convolutional), (2) learning paradigm (e.g., supervised vs. self-supervised), and (3) training procedures (e.g., data augmentation). |
| Researcher Affiliation | Industry | Fundamental AI Research (FAIR), Meta AI {byoubi,marksibrahim}@meta.com |
| Pseudocode | No | No pseudocode or algorithm blocks were found in the paper. |
| Open Source Code | Yes | We release all the ImageNet-X annotations along with an open-source toolkit to probe existing or new models' failure types. The data and code are available at https://facebookresearch.github.io/imagenetx/site/home. (A hedged annotation-loading sketch follows the table.) |
| Open Datasets | Yes | To address this need, we introduce ImageNet-X, a set of sixteen human annotations of factors such as pose, background, or lighting for the entire ImageNet-1k validation set as well as a random subset of 12k training images. |
| Dataset Splits | Yes | ImageNet-X contains human annotations for each of the 50,000 images in the validation set of the ImageNet dataset and a random sample of 12,000 images from the training set. |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory amounts, or detailed computer specifications) used for running experiments were provided. |
| Software Dependencies | No | The paper mentions data preprocessing using 'Pandas and Numpy, both freely available Python packages', but does not provide specific version numbers for these or other software dependencies. |
| Experiment Setup | Yes | Each run across all policies shares the exact same optimizer (SGD), weight decay (1e-5), mini-batch size (512), number of epochs (80), and data ordering through training. (A hedged configuration sketch follows the table.) |
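
The released annotations and toolkit are meant to let users break a model's errors down by factor of variation (pose, background, lighting, and so on). The snippet below is a minimal sketch of that workflow using only Pandas, which the paper lists among its preprocessing dependencies. The file names, the column names (`file_name`, `predicted_class`, `true_class`), and the factor columns are assumptions for illustration; they are not the released toolkit's actual API.

```python
import pandas as pd

# Hypothetical file names -- the actual release at
# https://facebookresearch.github.io/imagenetx/site/home may organize
# the annotations differently.
VAL_ANNOTATIONS = "imagenet_x_val.jsonl"      # 50,000 validation images
TRAIN_ANNOTATIONS = "imagenet_x_train.jsonl"  # 12,000 sampled training images

# The paper describes sixteen annotated factors such as pose, background,
# and lighting; the column names below are assumed for this sketch.
FACTORS = ["pose", "background", "lighting"]  # illustrative subset

def factor_error_rates(annotations: pd.DataFrame,
                       predictions: pd.DataFrame) -> pd.Series:
    """Join per-image factor annotations with model predictions and
    report the error rate within each annotated factor."""
    merged = annotations.merge(predictions, on="file_name")
    merged["is_error"] = merged["predicted_class"] != merged["true_class"]
    rates = {}
    for factor in FACTORS:
        subset = merged[merged[factor] == 1]  # images annotated with this factor
        rates[factor] = subset["is_error"].mean()
    return pd.Series(rates, name="error_rate")

if __name__ == "__main__":
    annotations = pd.read_json(VAL_ANNOTATIONS, lines=True)
    # Assumed prediction dump: one row per image with file_name,
    # predicted_class, and true_class columns.
    predictions = pd.read_csv("model_predictions.csv")
    print(factor_error_rates(annotations, predictions))
```

Comparing these per-factor error rates across models is what lets the authors attribute mistakes to architecture, learning paradigm, or training procedure rather than to the data alone.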
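
The Experiment Setup row quotes a controlled comparison in which every run shares the optimizer, weight decay, mini-batch size, epoch count, and data ordering. The sketch below mirrors those stated values in a plain PyTorch training loop; the framework choice, model, dataset, learning rate, and fixed shuffling seed are assumptions not given in the quoted setup.

```python
import torch
from torch import nn, optim
from torch.utils.data import DataLoader

# Values taken from the Experiment Setup row above.
WEIGHT_DECAY = 1e-5
BATCH_SIZE = 512
EPOCHS = 80

def train(model: nn.Module, dataset, lr: float = 0.1, device: str = "cuda"):
    """Shared training loop sketch: SGD, fixed weight decay, fixed batch size,
    and a fixed number of epochs. A seeded generator keeps the data ordering
    identical across runs, matching the controlled-comparison setup."""
    loader = DataLoader(
        dataset,
        batch_size=BATCH_SIZE,
        shuffle=True,
        generator=torch.Generator().manual_seed(0),  # assumed seed for fixed ordering
    )
    optimizer = optim.SGD(model.parameters(), lr=lr, weight_decay=WEIGHT_DECAY)
    criterion = nn.CrossEntropyLoss()
    model.to(device).train()
    for epoch in range(EPOCHS):
        for images, labels in loader:
            images, labels = images.to(device), labels.to(device)
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()
```

Holding every hyperparameter and the data ordering constant is what allows differences in factor-level error rates to be attributed to the single varied ingredient (e.g., the data augmentation policy) rather than to training noise.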