Interpreting and Disentangling Feature Components of Various Complexity from DNNs

Authors: Jie Ren, Mingjie Li, Zexu Liu, Quanshi Zhang

ICML 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Datasets, DNNs & Implementation details. We used our method to analyze VGG-16 (Simonyan et al., 2017) and ResNet-8/14/18/20/32/34/44 (He et al., 2016). For simplicity, we limited our attention to coarse-grained and fine-grained classification. We trained these DNNs based on the CIFAR-10 dataset (Krizhevsky et al., 2009) and the CUB200-2011 dataset (Wah et al., 2011).
Researcher Affiliation | Academia | Shanghai Jiao Tong University. Quanshi Zhang is the corresponding author. He is with the John Hopcroft Center and the MoE Key Lab of Artificial Intelligence, AI Institute, at Shanghai Jiao Tong University, China.
Pseudocode | No | No structured pseudocode or algorithm blocks were found in the paper.
Open Source Code | No | The paper does not include an explicit statement about releasing its source code, nor a link to a code repository for the described method.
Open Datasets | Yes | We trained these DNNs based on the CIFAR-10 dataset (Krizhevsky et al., 2009) and the CUB200-2011 dataset (Wah et al., 2011). (See the data-loading sketch after the table.)
Dataset Splits | No | The paper mentions using different numbers of training samples and a 'test set', but it does not give explicit training/validation/test splits (e.g., percentages, counts, or references to predefined splits) for the main DNN training. A cross-validation setup is described for a regressor, but not for the primary DNN training.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory, or cloud computing instances) used to run its experiments.
Software Dependencies | No | The paper does not list software dependencies with version numbers (e.g., Python 3.8, PyTorch 1.9).
Experiment Setup | Yes | We design decomposer nets Φ^(1)(x), ..., Φ^(L)(x) with residual architectures. The decomposer net consists of three types of residual blocks, each type having m blocks. Each block of the three types consists of a ReLU layer and a convolutional layer with 128γ, 256γ, 512γ channels, respectively. In most experiments, we set γ = 1, but in Figure 3(a), we try different values of γ to test decomposer nets of different widths. ... To boost the learning efficiency, we used parameters of the learned Φ^(l_i) to initialize the first l_i layers in Φ^(l_{i+1}). (See the architecture sketch after the table.)
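
Because the paper does not release code, the following is only a minimal sketch of how the publicly available CIFAR-10 data cited in the table could be loaded for training. The augmentation choices, normalization statistics, batch size, and data root are assumptions, not values taken from the paper.

```python
import torch
import torchvision
import torchvision.transforms as T

# Assumed preprocessing; the paper does not specify its augmentation pipeline.
transform = T.Compose([
    T.RandomCrop(32, padding=4),       # common CIFAR-10 augmentation (assumption)
    T.RandomHorizontalFlip(),
    T.ToTensor(),
    T.Normalize((0.4914, 0.4822, 0.4465),
                (0.2470, 0.2435, 0.2616)),  # standard CIFAR-10 statistics (assumption)
])

train_set = torchvision.datasets.CIFAR10(root="./data", train=True,
                                         download=True, transform=transform)
train_loader = torch.utils.data.DataLoader(train_set, batch_size=128,
                                           shuffle=True, num_workers=4)

# CUB200-2011 is not bundled with torchvision; after downloading it from the
# Caltech Vision Lab and arranging images into per-class folders, it can be
# wrapped with torchvision.datasets.ImageFolder in the same way.
```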
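The quoted experiment-setup text describes the decomposer-net architecture only at a high level. Below is a hedged PyTorch sketch of that description; the kernel size, the 1x1 projections between width stages, the identity skip connections, the default value of m, and the class/parameter names are assumptions not stated in the quote.

```python
import torch
import torch.nn as nn

class ResBlock(nn.Module):
    """One residual block: a ReLU layer followed by a convolutional layer (3x3 kernel assumed)."""
    def __init__(self, channels: int):
        super().__init__()
        self.relu = nn.ReLU(inplace=True)
        self.conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, x):
        # Identity skip connection is an assumption; the quote only says "residual architectures".
        return x + self.conv(self.relu(x))

class DecomposerNet(nn.Module):
    """Three types of residual blocks, m blocks per type, with 128*gamma / 256*gamma / 512*gamma channels."""
    def __init__(self, in_channels: int, m: int = 3, gamma: float = 1.0):
        super().__init__()
        widths = [int(128 * gamma), int(256 * gamma), int(512 * gamma)]
        layers = []
        prev = in_channels
        for w in widths:
            layers.append(nn.Conv2d(prev, w, kernel_size=1))  # width change via 1x1 projection (assumption)
            layers.extend(ResBlock(w) for _ in range(m))
            prev = w
        self.body = nn.Sequential(*layers)

    def forward(self, x):
        return self.body(x)

# Example: a decomposer net applied to an intermediate 128-channel feature map.
phi = DecomposerNet(in_channels=128, m=3, gamma=1.0)
out = phi(torch.randn(1, 128, 16, 16))
```

The warm-start trick quoted in the table (initializing the first l_i layers of Φ^(l_{i+1}) with the learned parameters of Φ^(l_i)) would, under this sketch, amount to copying the matching entries of the smaller net's state_dict into the larger net before training it.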