Hierarchical nucleation in deep neural networks
Authors: Diego Doimo, Aldo Glielmo, Alessio Ansuini, Alessandro Laio
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In the present paper, we analyse how the probability density of the data changes across the layers of a DCN. We consider in particular DCNs trained for classifying ImageNet; as we will see, the complexity and heterogeneity of this dataset critically affects the results of our analysis. We extracted the activations of the training set of ILSVRC2012 from a selection of pretrained PyTorch models: ResNets [19], DenseNets [20], VGGs [21] and GoogLeNet [22]. |
| Researcher Affiliation | Academia | Diego Doimo, International School for Advanced Studies, ddoimo@sissa.it; Aldo Glielmo, International School for Advanced Studies, aglielmo@sissa.it; Alessio Ansuini, Area Science Park, alessio.ansuini@areasciencepark.it; Alessandro Laio, International School for Advanced Studies, laio@sissa.it |
| Pseudocode | No | The paper describes the methodology for estimating probability density and neighborhood overlap in text (Sec. 2.1 and 2.2) but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Reproducibility: The source code of our experiments, with the instructions required to run it on a selection of layers, is available at https://github.com/diegodoimo/hierarchical_nucleation as well as in the online supplementary material. |
| Open Datasets | Yes | We perform our analysis on the ILSVRC2012 dataset, a subset of 1000 mutually exclusive classes of ImageNet which can be considered leaves of a hierarchical structure with 860 internal nodes. The analysis in this work is performed on a subset of 300 randomly chosen categories, including 300 images for each category, for a total of 90,000 images. [38] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei. ImageNet: A Large-Scale Hierarchical Image Database. In CVPR, 2009. |
| Dataset Splits | No | The paper states that the analysis is performed on 'a subset of 300 randomly chosen categories, including 300 images for each category, for a total of 90,000 images' from the ILSVRC2012 training set, and that they extracted activations from 'pretrained PyTorch models'. However, it does not explicitly specify any training, validation, or test dataset splits for their own analysis experiments. |
| Hardware Specification | No | The paper does not provide specific hardware details such as exact GPU/CPU models, processor types, or memory amounts used for running its experiments. |
| Software Dependencies | No | The paper mentions using 'pretrained PyTorch models' and references various methods and algorithms (e.g., CKA, WPGMA), but it does not specify concrete version numbers for any software dependencies, such as specific PyTorch versions or library versions. |
| Experiment Setup | Yes | We set k to one tenth of the number of images per class, but we verified that our findings are robust with respect to the choice of k over a wide range of values (see Sec. A.1). The analysis in this work is performed on a subset of 300 randomly chosen categories, including 300 images for each category, for a total of 90,000 images. We extracted the activations of the training set of ILSVRC2012 from a selection of pretrained PyTorch models: ResNets [19], DenseNets [20], VGGs [21] and GoogLeNet [22]. To compare architectures of different depths we will use as checkpoints the layers that downsample the channels and the final fully connected layers. Figure 2-a shows the behaviour of χ^{l,out} as a function of l for the checkpoint layers of the ResNet152 described in Sec. 2.3 (orange line). |
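The χ^{l,out} statistic quoted in the Experiment Setup row measures, for each image, the fraction of its k nearest neighbours in a layer's activation space that share its ground-truth class label, averaged over the dataset. A minimal sketch of that computation is below; the function name and the brute-force nearest-neighbour search are our own illustrative choices, not the authors' released implementation.

```python
import numpy as np

def neighborhood_overlap_with_labels(activations, labels, k):
    """Average fraction of each point's k nearest neighbours (Euclidean
    distance in activation space) that carry the same class label.
    Hypothetical sketch of the chi^{l,out} statistic described in the paper."""
    X = np.asarray(activations, dtype=float)
    y = np.asarray(labels)
    # Pairwise squared Euclidean distances between all activations.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(d2, np.inf)  # a point is not its own neighbour
    # Indices of the k nearest neighbours of each point.
    nn = np.argsort(d2, axis=1)[:, :k]
    # Fraction of neighbours whose label matches, averaged over points.
    return float((y[nn] == y[:, None]).mean())
```

Following the paper's stated choice, k would be set to one tenth of the number of images per class (k = 30 for 300 images per category); for the full 90,000-image analysis a tree-based or approximate nearest-neighbour search would replace the O(N²) distance matrix used here.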