Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Deconfounded Representation Similarity for Comparison of Neural Networks
Authors: Tianyu Cui, Yogesh Kumar, Pekka Marttinen, Samuel Kaski
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show that deconfounding the similarity metrics increases the resolution of detecting functionally similar neural networks across domains. Moreover, in real-world applications, deconfounding improves the consistency between CKA and domain similarity in transfer learning, and increases correlation between CKA and model out-of-distribution accuracy similarity. |
| Researcher Affiliation | Academia | Tianyu Cui, Department of Computer Science, Aalto University; Yogesh Kumar, Department of Computer Science, Aalto University; Pekka Marttinen, Department of Computer Science, Aalto University; Samuel Kaski, Department of Computer Science, Aalto University and University of Manchester |
| Pseudocode | No | The paper describes algorithms and methods in prose and mathematical equations but does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes] We provide the code in the supplemental material. |
| Open Datasets | Yes | Setup: We check if similarity measures can identify functionally similar NN representations from random NN representations. For each model block of ResNets (containing 2-3 convolutional layers), we generate two distributions of similarities: the null distribution H0 and the alternative distribution H1. The H0 contains similarities between 50 pairs of random ResNets on the CIFAR-10 test set. ... Distribution H1 contains similarities between the pretrained ImageNet NN and each of the 50 ResNets trained on CIFAR-10 from scratch with different random initializations, on the same CIFAR-10 test set as H0. |
| Dataset Splits | Yes | We compute the layer-wise CKA and dCKA between each FT model and the corresponding PT model on the test set of the target domain [22]. ... 2. Evaluate the OOD accuracy of each model on CIFAR-10-C [40], acc(f_i), and select the most accurate ResNet as the reference model, f; 3. Compute the similarity between each f_i and f, s(f_i, f), of each block on the CIFAR-10 test set (in-distribution similarity)... |
| Hardware Specification | Yes | In experiments, computing dCKA between two XLM-RoBERTa models [38] takes 0.37 ± 0.11s longer than CKA for each layer on 3000 random English sentences with a single 2080Ti GPU. |
| Software Dependencies | No | The paper mentions using specific models like ResNets, XLM-RoBERTa, EfficientNet-B0, and DistilRoBERTa, and refers to PyTorch models in a footnote, but it does not specify exact version numbers for any software libraries, frameworks, or programming languages used. |
| Experiment Setup | Yes | Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? [Yes] See Appendix C. |
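The excerpts above repeatedly reference CKA, the representation-similarity measure that the paper's dCKA deconfounds. For orientation only, here is a minimal sketch of linear CKA (Kornblith et al., 2019), the baseline the paper builds on; this is illustrative NumPy, not the authors' released code, and the array shapes are assumptions:

```python
import numpy as np

def linear_cka(X, Y):
    """Linear CKA between two representation matrices.

    X, Y: arrays of shape (n_examples, n_features); the feature
    dimensions of the two networks may differ.
    Returns a similarity in [0, 1]; 1 means identical up to
    orthogonal transformation and isotropic scaling.
    """
    # Center each feature column so CKA is translation-invariant.
    X = X - X.mean(axis=0, keepdims=True)
    Y = Y - Y.mean(axis=0, keepdims=True)
    # HSIC-style numerator and normalization (Frobenius norms).
    hsic = np.linalg.norm(Y.T @ X, "fro") ** 2
    norm = np.linalg.norm(X.T @ X, "fro") * np.linalg.norm(Y.T @ Y, "fro")
    return hsic / norm
```

The paper's contribution is to regress out confounders (e.g. input similarity structure) before computing this quantity; the sketch above is only the undeconfounded baseline that dCKA is compared against.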