Natural Language Descriptions of Deep Visual Features

Authors: Evan Hernandez, Sarah Schwettmann, David Bau, Teona Bagashvili, Antonio Torralba, Jacob Andreas

ICLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments highlight three: using MILAN-generated descriptions to (1) analyze the role and importance of different neuron classes in convolutional image classifiers, (2) audit models for demographically sensitive features by comparing their features when trained on anonymized (blurred) and non-anonymized datasets, and (3) identify and mitigate the effects of spurious correlations with text features, improving classifier performance on adversarially distributed test sets.
Researcher Affiliation | Academia | MIT CSAIL, Northeastern University, Allegheny College
Pseudocode | No | The paper describes the MILAN procedure using mathematical equations and descriptive text, but it does not include a clearly labeled pseudocode block or algorithm.
Open Source Code | Yes | Code, data, and an interactive demonstration may be found at http://milan.csail.mit.edu/.
Open Datasets | Yes | These models cover two datasets, specifically ImageNet (Deng et al., 2009) and Places365 (Zhou et al., 2017), as well as two completely different families of models, CNNs and Vision Transformers (ViT) (Dosovitskiy et al., 2021).
Dataset Splits | Yes | To test generalization within a network, we train on 90% of neurons from each network and test on the remaining 10%. AND Training details can be found in Appendix E. ... holding out 10% of the training data as a validation dataset for early stopping. (See the first sketch after the table.)
Hardware Specification | Yes | We also thank IBM for the donation of the Satori supercomputer that enabled training BigGAN on MIT Places. AND a hardware gift from NVIDIA under the NVAIL grant program.
Software Dependencies | No | The paper mentions software like "PyTorch (Paszke et al., 2019)", but it does not specify the version numbers needed to reproduce the software environment.
Experiment Setup | Yes | The model is trained to minimize cross entropy on the training set using the AdamW optimizer (Loshchilov & Hutter, 2019) with a learning rate of 1e-3 and minibatches of size 64. AND Hyperparameters: We train a randomly initialized ResNet18 on the spurious training dataset for a maximum of 100 epochs with a learning rate of 1e-4 and a minibatch size of 128. (Both configurations are sketched below.)
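
The Dataset Splits row and the first Experiment Setup quote describe one pipeline: split neurons 90/10 into train and test, hold out 10% of the training portion as a validation set for early stopping, and minimize cross entropy with AdamW (learning rate 1e-3, minibatches of 64). Below is a minimal PyTorch sketch of that setup, assuming placeholder data shapes, a stand-in linear "decoder", and a patience value of 5; none of these come from the paper, which uses the full MILAN description decoder.

```python
# Minimal sketch of the quoted split and training configuration.
# The dataset, decoder, and patience value are hypothetical stand-ins.
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset, random_split

# Hypothetical stand-in: per-neuron features paired with description labels.
num_neurons, feat_dim, vocab = 1000, 512, 100
data = TensorDataset(torch.randn(num_neurons, feat_dim),
                     torch.randint(vocab, (num_neurons,)))

# "train on 90% of neurons ... and test on the remaining 10%"
n_test = num_neurons // 10
train_all, test_set = random_split(data, [num_neurons - n_test, n_test])

# "holding out 10% of the training data as a validation dataset for early stopping"
n_val = len(train_all) // 10
train_set, val_set = random_split(train_all, [len(train_all) - n_val, n_val])

model = nn.Linear(feat_dim, vocab)  # placeholder for the description decoder
# "AdamW optimizer ... learning rate of 1e-3 and minibatches of size 64"
opt = torch.optim.AdamW(model.parameters(), lr=1e-3)
loader = DataLoader(train_set, batch_size=64, shuffle=True)
loss_fn = nn.CrossEntropyLoss()

best_val, patience = float("inf"), 0
for epoch in range(50):
    model.train()
    for x, y in loader:
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()
    # Early stopping on the held-out validation split.
    model.eval()
    with torch.no_grad():
        val = sum(loss_fn(model(x), y).item()
                  for x, y in DataLoader(val_set, batch_size=64))
    if val < best_val:
        best_val, patience = val, 0
    else:
        patience += 1
        if patience >= 5:  # patience value is an assumption, not from the paper
            break
```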
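
The second Experiment Setup quote covers the spurious-correlation experiment: a randomly initialized ResNet18 trained for at most 100 epochs with a learning rate of 1e-4 and minibatches of 128. The sketch below fills in what the quote leaves open — the optimizer type and the dataset are assumptions, with random tensors standing in for the spurious training set.

```python
# Sketch of the quoted ResNet18 fine-tuning configuration.
# Only the epoch count, learning rate, and batch size come from the quote.
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset
from torchvision.models import resnet18

# Placeholder data standing in for the spurious training dataset.
train_set = TensorDataset(torch.randn(256, 3, 224, 224),
                          torch.randint(10, (256,)))

model = resnet18(weights=None, num_classes=10)  # randomly initialized
# The optimizer type is not stated in the quote; Adam here is an assumption.
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
loader = DataLoader(train_set, batch_size=128, shuffle=True)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(100):  # "a maximum of 100 epochs"
    for x, y in loader:
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()
```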