Unveiling the Unseen: Identifiable Clusters in Trained Depthwise Convolutional Kernels

Authors: Zahra Babaiee, Peyman Kiasari, Daniela Rus, Radu Grosu

ICLR 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Through an extensive analysis of millions of trained filters, of different sizes and from various models, we employed unsupervised clustering with autoencoders to categorize these filters. Astonishingly, the patterns converged into a few main clusters, each resembling difference-of-Gaussians (DoG) functions and their first- and second-order derivatives. Notably, we were able to classify over 95% and 90% of the filters from the state-of-the-art ConvNeXt V2 and ConvNeXt models, respectively.
Researcher Affiliation | Academia | Zahra Babaiee (TU Vienna & MIT, zbabaiee@mit.edu); Peyman M. Kiasari (TU Vienna, peyman.kiasari@tuwien.ac.at); Daniela Rus (MIT, rus@mit.edu); Radu Grosu (TU Vienna, radu.grosu@tuwien.ac.at)
Pseudocode | No | The paper does not include any explicit pseudocode blocks or algorithms labeled as such. It describes methods in narrative form.
Open Source Code | No | The paper does not include any statements about releasing code for the methodology described, nor does it provide links to a code repository.
Open Datasets | Yes | Through an extensive investigation of several model types and sizes, including regular CNNs and DS-CNNs, trained on ImageNet-1k and ImageNet-21k, we discovered a striking property of DS-CNN kernels.
Dataset Splits | No | The paper mentions training on ImageNet-1k and ImageNet-21k but does not explicitly specify the train/validation/test splits, percentages, or method of data partitioning required to reproduce the experiments.
Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, memory) used to run the experiments.
Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies or libraries used in the experiments (e.g., PyTorch, TensorFlow, CUDA versions).
Experiment Setup | Yes | To carry out a comprehensive classification of the trained kernels, we developed an unsupervised clustering method using autoencoders. The primary objective was to project the kernels onto a compact hidden space and subsequently perform clustering within this dimensionally reduced space. Distinct models were trained for each kernel size of 5×5 and 7×7. For every kernel size, kernels learned from diverse models and scales of the corresponding size were assembled; the compilation comprised over one million kernels for each size category. The autoencoder consists of two main components: an encoder and a decoder. The encoder has four intermediate layers, each followed by a leaky rectified linear unit (LeakyReLU) activation. The code layer employs a sigmoid activation to map filters to values within [0,1]. The final decoder layer uses a tanh activation to accurately reconstruct the original normalized filters in [-1,1]. We utilize a mean-centered cosine similarity loss to accommodate the invariance of filter patterns to linear transformations. [...] We select a threshold of 0.3 for 7×7 kernels and 0.2 for 5×5 kernels. We chose the ConvMixer-768/32 model with a patch size of 14 and trained it over 50 epochs. A second model was initialized with various DoG functions, using random variances similar to those of the trained kernels. The initialization was distributed as 45% on-center, 10% off-center, 15% cross, 20% first derivatives, and the rest second derivatives.
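
The quoted setup lends itself to a short illustration. The minimal PyTorch sketch below shows one way the kernel autoencoder and the mean-centered cosine similarity loss could be implemented. The layer widths, the code dimension, and the flattening of each k×k kernel into a vector are assumptions not stated in the excerpt; only the activation choices (LeakyReLU hidden layers, sigmoid code layer, tanh output) and the mean-centered cosine loss follow the description above.

```python
# Minimal sketch of the kernel-clustering autoencoder (assumed dimensions).
import torch
import torch.nn as nn
import torch.nn.functional as F

class KernelAutoencoder(nn.Module):
    def __init__(self, kernel_size=7, code_dim=8):
        super().__init__()
        in_dim = kernel_size * kernel_size  # flattened depthwise kernel
        # Encoder: four intermediate layers, each followed by LeakyReLU,
        # then a sigmoid code layer mapping codes into [0, 1].
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, 64), nn.LeakyReLU(),
            nn.Linear(64, 48), nn.LeakyReLU(),
            nn.Linear(48, 32), nn.LeakyReLU(),
            nn.Linear(32, 16), nn.LeakyReLU(),
            nn.Linear(16, code_dim), nn.Sigmoid(),
        )
        # Decoder mirrors the encoder; the final tanh keeps the
        # reconstructed normalized filters in [-1, 1].
        self.decoder = nn.Sequential(
            nn.Linear(code_dim, 16), nn.LeakyReLU(),
            nn.Linear(16, 32), nn.LeakyReLU(),
            nn.Linear(32, 48), nn.LeakyReLU(),
            nn.Linear(48, 64), nn.LeakyReLU(),
            nn.Linear(64, in_dim), nn.Tanh(),
        )

    def forward(self, x):
        code = self.encoder(x)
        return self.decoder(code), code


def mean_centered_cosine_loss(recon, target):
    """1 - cosine similarity after removing each filter's mean, so the loss
    is invariant to additive offsets and positive rescaling of a filter."""
    recon = recon - recon.mean(dim=1, keepdim=True)
    target = target - target.mean(dim=1, keepdim=True)
    return (1.0 - F.cosine_similarity(recon, target, dim=1)).mean()
```

Clustering would then be performed on the code vectors produced by the encoder, with the reported thresholds (0.3 for 7×7, 0.2 for 5×5) applied when assigning filters to clusters.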
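The DoG-based initialization of the second ConvMixer model can likewise be sketched. In the NumPy snippet below, the variance range, the grid normalization, and the reading of the "cross" cluster as the mixed second derivative are assumptions made for illustration; only the mixture proportions (45% on-center, 10% off-center, 15% cross, 20% first derivative, remainder second derivative) come from the quoted text.

```python
# Hypothetical sketch of initializing depthwise kernels from a DoG mixture.
import numpy as np

def dog(size, sigma1, sigma2):
    """Difference of two isotropic Gaussians on a size x size grid."""
    ax = np.arange(size) - (size - 1) / 2.0
    xx, yy = np.meshgrid(ax, ax)
    g1 = np.exp(-(xx**2 + yy**2) / (2 * sigma1**2))
    g2 = np.exp(-(xx**2 + yy**2) / (2 * sigma2**2))
    return g1 / g1.sum() - g2 / g2.sum()

def sample_kernel(size=7, rng=None):
    """Draw one kernel from the assumed DoG mixture with the quoted proportions."""
    rng = rng or np.random.default_rng()
    s1, s2 = sorted(rng.uniform(0.5, 2.5, size=2))  # assumed variance range
    base = dog(size, s1, s2)                         # on-center DoG (positive peak)
    kind = rng.choice(["on", "off", "cross", "d1", "d2"],
                      p=[0.45, 0.10, 0.15, 0.20, 0.10])
    if kind == "on":
        return base                                  # on-center DoG
    if kind == "off":
        return -base                                 # off-center: sign-flipped DoG
    if kind == "cross":
        # Assumed interpretation: mixed second derivative d^2/(dx dy) of the DoG.
        return np.gradient(np.gradient(base, axis=1), axis=0)
    if kind == "d1":
        return np.gradient(base, axis=1)             # first derivative along x
    return np.gradient(np.gradient(base, axis=1), axis=1)  # second derivative
```

Each depthwise kernel of the ConvMixer model would be filled with one such sample before the 50-epoch training run described above.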