Similarity and Matching of Neural Network Representations
Authors: Adrián Csiszárik, Péter Kőrösi-Szabó, Ákos Matszangosz, Gergely Papp, Dániel Varga
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We employ a toolset dubbed Dr. Frankenstein to analyse the similarity of representations in deep neural networks. ... We demonstrate that the inner representations emerging in deep convolutional neural networks with the same architecture but different initializations can be matched with a surprisingly high degree of accuracy even with a single, affine stitching layer. ... Our results in this section can be summarized in the following statement: Neural representations arising on a given layer of convolutional networks that share the same architecture but differ in initialization can be matched with a single affine stitching layer, achieving close to original performance on the stitched network. (An illustrative sketch of such an affine stitching layer is given after this table.) |
| Researcher Affiliation | Academia | Alfréd Rényi Institute of Mathematics, Budapest, Hungary; Eötvös Loránd University, Budapest, Hungary; {csadrian, koszpe, matszang, gergopool, daniel}@renyi.hu |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | Yes | "A brief outline of this work is the following:" with the accompanying footnote: "Code is available at the project website: https://bit.ly/similarity-and-matching." |
| Open Datasets | Yes | We conduct experiments on three different convolutional architectures: a simple, 10-layer convnet called Tiny-10 (used in Kornblith et al. [2019]), on a ResNet-20 [He et al., 2016], and on the Inception V1 network [Szegedy et al., 2015]. These networks are rather standard and represent different types of architectures. Tiny-10 and ResNet-20 are trained and evaluated on CIFAR-10, Inception V1 on the 40-label CelebA task. |
| Dataset Splits | Yes | Tiny-10 and ResNet-20 are trained and evaluated on CIFAR-10, Inception V1 on the 40-label CelebA task. Further training details follow common practices (see Appendix A.3). ... Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? [Yes] The Appendix specifies the training details according to the standards of the field. The source code is public. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, memory, or cloud instance types. It only mentions, 'Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? [Yes] In Appendix A.' However, Appendix A is not provided in the given text. |
| Software Dependencies | No | The paper mentions common deep learning frameworks implicitly through the architectures (e.g., ResNet, Inception) and general training practices, but it does not specify any software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions). The response to the ethics checklist only states 'The source code is public' without listing dependencies. |
| Experiment Setup | Yes | Further training details follow common practices (see Appendix A.3). ... We use the outputs of Model 2 as a soft label [Hinton et al., 2015] and define the task loss with cross-entropy. ... In the case of task loss matching, we found the most stable results when we initialized the training of the stitching layer from the least squares matching (2). ... The Appendix specifies the training details according to the standards of the field. (A sketch of the least-squares initialization and the soft-label task loss is given after this table.) |
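
For context on the "single, affine stitching layer" quoted in the Research Type row, here is a minimal sketch, assuming PyTorch. It treats the stitching layer as a 1x1 convolution with bias placed between the frozen front part of Model 1 and the frozen back part of Model 2; the module and argument names (`model1_front`, `model2_back`, `num_channels`) are illustrative assumptions, not taken from the authors' released code.

```python
# Hypothetical sketch of an affine stitching layer between two frozen networks.
import torch
import torch.nn as nn

class StitchedModel(nn.Module):
    def __init__(self, model1_front: nn.Module, model2_back: nn.Module, num_channels: int):
        super().__init__()
        self.front = model1_front  # layers of Model 1 up to the stitching point
        self.back = model2_back    # layers of Model 2 after the stitching point
        # A 1x1 convolution with bias is an affine (channel-wise linear) map per spatial location.
        self.stitch = nn.Conv2d(num_channels, num_channels, kernel_size=1, bias=True)
        # Freeze both pretrained networks; only the stitching layer is trained.
        for p in self.front.parameters():
            p.requires_grad = False
        for p in self.back.parameters():
            p.requires_grad = False

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.front(x)     # activations of Model 1 at the stitching layer
        h = self.stitch(h)    # affine map into Model 2's "coordinate system"
        return self.back(h)   # continue the forward pass through Model 2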
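
Similarly, the Experiment Setup row quotes two ingredients: initializing the stitching layer from a least-squares match (equation (2) in the paper) and training it with a cross-entropy task loss against Model 2's outputs used as soft labels [Hinton et al., 2015]. The sketch below illustrates both under the same assumptions as above; the helper names `least_squares_init` and `soft_label_task_loss` are hypothetical.

```python
# Hypothetical sketch of least-squares initialization and the soft-label task loss.
import torch
import torch.nn.functional as F

def least_squares_init(acts1: torch.Tensor, acts2: torch.Tensor):
    """acts1, acts2: activations of shape (N, C, H, W) from Model 1 and Model 2
    at the stitching layer, collected on the same inputs."""
    # Treat every (sample, spatial position) pair as one observation with C features.
    x = acts1.permute(0, 2, 3, 1).reshape(-1, acts1.shape[1])  # (N*H*W, C)
    y = acts2.permute(0, 2, 3, 1).reshape(-1, acts2.shape[1])  # (N*H*W, C)
    ones = torch.ones(x.shape[0], 1, dtype=x.dtype, device=x.device)
    x_aug = torch.cat([x, ones], dim=1)              # append a bias column
    sol = torch.linalg.lstsq(x_aug, y).solution      # shape (C+1, C)
    weight, bias = sol[:-1], sol[-1]                 # affine map: h -> h @ weight + bias
    return weight, bias

def soft_label_task_loss(stitched_logits: torch.Tensor, model2_logits: torch.Tensor) -> torch.Tensor:
    # Cross-entropy between the stitched network's prediction and Model 2's
    # output distribution used as a soft label.
    soft_targets = F.softmax(model2_logits, dim=1)
    return -(soft_targets * F.log_softmax(stitched_logits, dim=1)).sum(dim=1).mean()

# Usage (with the StitchedModel sketch above; copy the solution into the 1x1 conv):
#   weight, bias = least_squares_init(acts1, acts2)
#   stitched.stitch.weight.data = weight.t().unsqueeze(-1).unsqueeze(-1)  # (C, C, 1, 1)
#   stitched.stitch.bias.data = bias
```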