Shape or Texture: Understanding Discriminative Features in CNNs
Authors: Md Amirul Islam, Matthew Kowal, Patrick Esser, Sen Jia, Björn Ommer, Konstantinos G. Derpanis, Neil Bruce
ICLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this paper, we design a series of experiments that overcome these issues. We do this with the goal of better understanding what type of shape information contained in the network is discriminative, where shape information is encoded, as well as when the network learns about object shape during training. |
| Researcher Affiliation | Collaboration | (1) Department of Computer Science, Ryerson University, Canada; (2) University of Waterloo, Canada; (3) IWR, HCI, Heidelberg University, Germany; (4) School of Computer Science, University of Guelph, Canada; (5) Samsung AI Centre Toronto, Canada; (6) Vector Institute for AI, Canada |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | All code will be released to reproduce data and results. |
| Open Datasets | Yes | PASCAL VOC 2012 dataset (Everingham et al., 2010), ImageNet (IN) (Deng et al., 2009), Describable Textures Dataset (Cimpoi et al., 2014) |
| Dataset Splits | Yes | We use the trainaug and val split of the VOC 2012 dataset to train and test the read-out module, respectively. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers. |
| Experiment Setup | Yes | We obtain 18 different instances of a ResNet50 model during training on IN and SIN, each representing a checkpoint between epochs 0 and 90 (equally distributed). We use ResNet networks of various depths (i.e., 34, 50, and 101) as SENs with a readout module containing either one or three convolution layers with 3x3 kernels. The vertical lines represent multiplying the learning rate by a factor of 0.1. |
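
The learning-rate drops quoted in the Experiment Setup row correspond to a step decay schedule. The following is a minimal sketch of such a schedule, not the authors' training code. Only the 0.1 factor comes from the quoted text; the base learning rate and the drop epochs (`base_lr`, `milestones`) are hypothetical placeholders, since the excerpt does not state them.

```python
def step_decay_lr(epoch, base_lr=0.1, milestones=(30, 60), gamma=0.1):
    """Return the learning rate at `epoch` under step decay.

    gamma=0.1 matches the factor quoted in the table.
    base_lr and milestones are ASSUMED values for illustration only.
    """
    # Count how many drop points have been reached by this epoch.
    drops = sum(1 for m in milestones if epoch >= m)
    # Each drop multiplies the learning rate by gamma once.
    return base_lr * gamma ** drops


if __name__ == "__main__":
    # Print the rate just before and at each assumed drop epoch.
    for epoch in (0, 29, 30, 59, 60, 89):
        print(epoch, step_decay_lr(epoch))
```

Under these assumed settings the rate stays constant between drop epochs and shrinks tenfold at each one, which is what the vertical lines in the paper's training curves are described as marking.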