Representation Topology Divergence: A Method for Comparing Neural Network Representations.

Authors: Serguei Barannikov, Ilya Trofimov, Nikita Balabin, Evgeny Burnaev

ICML 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments show the proposed RTD agrees with the intuitive assessment of data representation similarity and is sensitive to its topological structure. We apply RTD to gain insights into neural network representations in computer vision and NLP domains for various problems: training dynamics analysis, data distribution shift, transfer learning, ensemble learning.
Researcher Affiliation | Academia | (1) Skolkovo Institute of Science and Technology, Moscow, Russia; (2) CNRS, Université Paris Cité, France; (3) Artificial Intelligence Research Institute (AIRI), Moscow, Russia
Pseudocode | Yes | The main steps of the computation are summarized in Algorithms 1 and 2.
Open Source Code | Yes | The source code is publicly available: https://github.com/IlyaTrofimov/RTD.
Open Datasets | Yes | we train ResNet-20 (He et al., 2016) and VGG-11 (Simonyan & Zisserman, 2014) networks on CIFAR (Krizhevsky et al., 2009) datasets.
Dataset Splits | No | The paper mentions using training and test datasets but does not explicitly specify validation splits, percentages, or methodology for creating such splits.
Hardware Specification | No | The paper mentions "GPU-optimized software" and "GPU acceleration" but does not provide specific details on the hardware used, such as GPU models, CPU types, or memory specifications.
Software Dependencies | No | The paper states that certain steps can be done using "scripts that are optimized for GPU acceleration (Zhang et al., 2020)", but it does not specify any software names with version numbers (e.g., Python, PyTorch, TensorFlow, specific libraries).
Experiment Setup | Yes | Table 8 (training the networks from random initialization on CIFAR datasets): number of epochs 100; optimizer SGD with momentum 0.9; initial learning rate 0.1; schedule 0.1 for the first 50% of epochs, linearly decayed from 0.1 to 0.001 between 50% and 90% of epochs, then 0.001; batch size 128. Table 9 gives the corresponding details for fine-tuning the ResNet-20 from CIFAR-100 to CIFAR-10. A hedged sketch of this training configuration follows the table.
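
The sketch below is a minimal reading of the Table 8 settings quoted in the Experiment Setup row, assuming PyTorch and torchvision. Everything not stated in that row is an assumption: the data transform, the use of torchvision's resnet18 as a stand-in (torchvision ships no CIFAR-style ResNet-20), and the interpretation of the schedule as a per-epoch piecewise-linear decay implemented with LambdaLR. It is not the authors' released training code.

# Sketch of the Table 8 configuration: SGD (momentum 0.9), initial LR 0.1,
# piecewise-linear decay to 0.001, batch size 128, 100 epochs, CIFAR-10.
import torch
import torchvision
import torchvision.transforms as T

EPOCHS = 100          # "Number of epochs 100"
BATCH_SIZE = 128      # "Batch size 128"
BASE_LR = 0.1         # "Learning rate (initial) 0.1"
FINAL_LR = 0.001      # endpoint of the linear decay described in Table 8

# CIFAR-10 via torchvision; the paper trains on the CIFAR datasets
# (Krizhevsky et al., 2009). The transform here is a placeholder.
transform = T.Compose([T.ToTensor()])
train_set = torchvision.datasets.CIFAR10(root="./data", train=True,
                                         download=True, transform=transform)
train_loader = torch.utils.data.DataLoader(train_set, batch_size=BATCH_SIZE,
                                           shuffle=True, num_workers=2)

# Stand-in model: substitute a CIFAR-style ResNet-20 implementation here.
model = torchvision.models.resnet18(num_classes=10)

optimizer = torch.optim.SGD(model.parameters(), lr=BASE_LR, momentum=0.9)

def lr_factor(epoch: int) -> float:
    """Multiplicative LR factor, as read from Table 8: 0.1 for the first 50%
    of epochs, linear decay to 0.001 between 50% and 90%, then 0.001."""
    frac = epoch / EPOCHS
    if frac < 0.5:
        return 1.0
    if frac < 0.9:
        t = (frac - 0.5) / 0.4                       # 0 -> 1 over the decay window
        return (BASE_LR + t * (FINAL_LR - BASE_LR)) / BASE_LR
    return FINAL_LR / BASE_LR

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=lr_factor)
criterion = torch.nn.CrossEntropyLoss()

for epoch in range(EPOCHS):
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
    scheduler.step()  # advance the learning-rate schedule once per epoch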