Emergence of Shape Bias in Convolutional Neural Networks through Activation Sparsity
Authors: Tianqin Li, Ziqi Wen, Yangfan Li, Tai Sing Lee
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrated this emergence of shape bias and its functional benefits for different network structures with various datasets. For object recognition convolutional neural networks, the shape bias leads to greater robustness against style and pattern change distraction. For the image synthesis generative adversarial networks, the emerged shape bias leads to more coherent and decomposable structures in the synthesized images. |
| Researcher Affiliation | Academia | Tianqin Li, Carnegie Mellon University (tianqinl@cs.cmu.edu); Ziqi Wen, Carnegie Mellon University (ziqiwen@cs.cmu.edu); Yangfan Li, Northwestern University (yangfanli2024@u.northwestern.edu); Tai Sing Lee, Carnegie Mellon University (tai@cnbc.cmu.edu) |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks explicitly labeled as such, or formatted as code-like steps. |
| Open Source Code | Yes | Our code is hosted at the GitHub repository: https://topk-shape-bias.github.io/ |
| Open Datasets | Yes | We trained ResNet18 [13] on different subsets of the ImageNet dataset [8]. |
| Dataset Splits | Yes | Table 1: Evaluation for models trained on the IN-S1 and IN-S2 datasets, each of which consists of 10 classes with all train/val data from the ImageNet-1k dataset [8]. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as GPU models, CPU types, or memory specifications. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., specific library versions or programming language versions). |
| Experiment Setup | Yes | We trained ResNet-18 models on the selected IN-S1 and IN-S2 datasets with standard Stochastic Gradient Descent (SGD, batch size 32) and a cosine annealing learning rate decay schedule with lr starting from 0.1. All models are then evaluated on the stylized versions of the IN-S1 and IN-S2 evaluation datasets after training for 50 epochs. |
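
The Experiment Setup and Dataset Splits rows above report the training protocol: ResNet-18 trained with SGD (batch size 32), cosine annealing from an initial learning rate of 0.1, for 50 epochs on 10-class ImageNet subsets. The following is a minimal PyTorch sketch of that protocol, not the paper's released code; the dataset directory, the momentum/weight-decay values, and the subset preparation are assumptions for illustration only.

```python
# Sketch of the reported protocol: ResNet-18, SGD (batch size 32),
# cosine annealing from lr 0.1, 50 epochs on a 10-class ImageNet subset.
# The data path below is a placeholder, NOT the paper's IN-S1/IN-S2 definition.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

device = "cuda" if torch.cuda.is_available() else "cpu"

# Hypothetical directory holding a 10-class subset of ImageNet-1k train data.
train_set = datasets.ImageFolder(
    "data/IN-S1/train",
    transform=transforms.Compose([
        transforms.RandomResizedCrop(224),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
    ]),
)
train_loader = DataLoader(train_set, batch_size=32, shuffle=True, num_workers=8)

model = models.resnet18(num_classes=10).to(device)
criterion = nn.CrossEntropyLoss()
# Momentum and weight decay are conventional defaults, not stated in the excerpt.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                            momentum=0.9, weight_decay=1e-4)
# Cosine annealing of the learning rate over the full 50-epoch schedule.
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=50)

for epoch in range(50):
    model.train()
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
    scheduler.step()
```

Evaluation on the stylized IN-S1/IN-S2 validation sets described in the paper would follow the same data-loading pattern with the stylized images substituted for the originals.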