On the Joint Interaction of Models, Data, and Features

Authors: Yiding Jiang, Christina Baek, J. Zico Kolter

ICLR 2024

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "We introduce a new tool, the interaction tensor, for empirically analyzing the interaction between data and model through features. With the interaction tensor, we make several key observations about how features are distributed in data and how models with different random seeds learn different features. We begin with an empirical investigation of feature learning, using a natural definition of features on real data (Figure 1) that allows us to easily compare information about the data distribution learned by different models and a construction we propose called the interaction tensor." |
| Researcher Affiliation | Collaboration | Yiding Jiang (Carnegie Mellon University, yidingji@cs.cmu.edu); Christina Baek (Carnegie Mellon University, kbaek@cs.cmu.edu); J. Zico Kolter (Carnegie Mellon University and Bosch Center for AI, zkolter@cs.cmu.edu) |
| Pseudocode | Yes | Algorithm 1: Cluster Features |
| Open Source Code | No | The paper does not provide an explicit statement about releasing source code or a direct link to a code repository for its methodology. |
| Open Datasets | Yes | "We use a collection of M = 20 ResNet18 (He et al., 2016a) trained on the CIFAR-10 dataset (Krizhevsky et al., 2009)" |
| Dataset Splits | No | The paper mentions training and test sets but does not explicitly describe a validation split needed for full reproducibility. |
| Hardware Specification | Yes | "All experiments in the paper are done on single Nvidia RTX 2080s and RTX A6000s." |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers. |
| Experiment Setup | Yes | "We train the 20 models with: initial learning rate: 0.1, weight decay: 0.0001, minibatch size: 100, data augmentation: No" |