Emergence of Shape Bias in Convolutional Neural Networks through Activation Sparsity

Authors: Tianqin Li, Ziqi Wen, Yangfan Li, Tai Sing Lee

NeurIPS 2023

Reproducibility assessment (variable, result, and supporting response):
Research Type: Experimental. "We demonstrated this emergence of shape bias and its functional benefits for different network structures with various datasets. For object-recognition convolutional neural networks, the shape bias leads to greater robustness against distraction from style and pattern changes. For image-synthesis generative adversarial networks, the emergent shape bias leads to more coherent and decomposable structures in the synthesized images."
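The mechanism named in the title is Top-K activation sparsity. As a minimal sketch of what such a layer can look like, assuming per-feature-map sparsification in PyTorch (the module below is illustrative; the paper's actual granularity, placement, and K values are defined in its released code):

```python
import torch
import torch.nn as nn

class TopKActivation(nn.Module):
    """Keep only the largest activations in each feature map, zero the rest.

    Illustrative sketch: `sparsity` is the fraction of spatial activations
    kept per channel. The paper's exact Top-K granularity and placement
    inside ResNet/GAN blocks follow its released code, not this module.
    """

    def __init__(self, sparsity: float = 0.1):
        super().__init__()
        self.sparsity = sparsity

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        n, c, h, w = x.shape
        flat = x.reshape(n, c, h * w)
        k = max(1, int(self.sparsity * h * w))
        # The k-th largest value in each feature map serves as the threshold;
        # ties at the threshold may keep slightly more than k activations.
        thresh = flat.topk(k, dim=-1).values[..., -1:]
        return (flat * (flat >= thresh)).reshape(n, c, h, w)
```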
Researcher Affiliation: Academia. Tianqin Li (Carnegie Mellon University, tianqinl@cs.cmu.edu); Ziqi Wen (Carnegie Mellon University, ziqiwen@cs.cmu.edu); Yangfan Li (Northwestern University, yangfanli2024@u.northwestern.edu); Tai Sing Lee (Carnegie Mellon University, tai@cnbc.cmu.edu).
Pseudocode: No. The paper does not contain structured pseudocode or algorithm blocks, either explicitly labeled as such or formatted as code-like steps.
Open Source Code: Yes. "Our code is hosted at the GitHub repository: https://topk-shape-bias.github.io/"
Open Datasets: Yes. "We trained ResNet-18 [13] on different subsets of the ImageNet dataset [8]."
Dataset Splits: Yes. "Table 1: Evaluation for models trained on the IN-S1 and IN-S2 datasets, each of which consists of 10 classes with all train/val data from the ImageNet-1k dataset [8]."
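Since IN-S1 and IN-S2 are 10-class subsets of ImageNet-1k, they can in principle be rebuilt from the standard dataset once the class lists are known. A sketch using torchvision, where the WNIDs are placeholders (the actual class selections come from the authors' repository, not from this excerpt):

```python
import torchvision.datasets as datasets
from torch.utils.data import Subset

# Placeholder WNIDs: the real 10-class lists for IN-S1/IN-S2 are defined
# by the authors, not here.
INS1_WNIDS = {"n01440764", "n01443537"}  # ... 10 synset IDs in total

def imagenet_subset(root: str, split: str, wnids: set) -> Subset:
    """Restrict torchvision's ImageNet to the samples whose synset ID
    (WNID) belongs to the given class list."""
    full = datasets.ImageNet(root=root, split=split)
    keep = [i for i, (_, label) in enumerate(full.samples)
            if full.wnids[label] in wnids]
    return Subset(full, keep)

ins1_train = imagenet_subset("/path/to/imagenet", "train", INS1_WNIDS)
ins1_val = imagenet_subset("/path/to/imagenet", "val", INS1_WNIDS)
```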
Hardware Specification: No. The paper does not provide specific details about the hardware used to run the experiments, such as GPU models, CPU types, or memory specifications.
Software Dependencies: No. The paper does not provide specific software dependencies with version numbers (e.g., library or programming-language versions).
Experiment Setup: Yes. "We trained ResNet-18 models on the selected IN-S1 and IN-S2 datasets with standard Stochastic Gradient Descent (SGD, batch size 32) and a cosine-annealing learning-rate decay schedule with the learning rate starting at 0.1. All models are then evaluated on the stylized versions of the IN-S1 and IN-S2 evaluation sets after training for 50 epochs."
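A minimal sketch of that reported setup in PyTorch. Momentum and weight decay are not stated in the excerpt, so the values below are common ImageNet defaults rather than the paper's, and FakeData stands in for the IN-S1/IN-S2 loaders:

```python
import torch
import torchvision.models as models
import torchvision.transforms as T
from torch.utils.data import DataLoader
from torchvision.datasets import FakeData

# Stand-in for an IN-S1/IN-S2 loader (10 classes, batch size 32 as reported).
train_loader = DataLoader(
    FakeData(size=64, image_size=(3, 224, 224), num_classes=10,
             transform=T.ToTensor()),
    batch_size=32, shuffle=True)

model = models.resnet18(num_classes=10)
criterion = torch.nn.CrossEntropyLoss()
# Reported: SGD with lr starting at 0.1, cosine annealing over 50 epochs.
# Momentum/weight decay are assumptions (common defaults), not reported.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                            momentum=0.9, weight_decay=1e-4)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=50)

for epoch in range(50):
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
    scheduler.step()  # anneal the learning rate once per epoch
```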