On the Joint Interaction of Models, Data, and Features
Authors: Yiding Jiang, Christina Baek, J. Zico Kolter
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We introduce a new tool, the interaction tensor, for empirically analyzing the interaction between data and model through features. With the interaction tensor, we make several key observations about how features are distributed in data and how models with different random seeds learn different features. We begin with an empirical investigation of feature learning, using a natural definition of features on real data (Figure 1), which allows us to easily compare information about the data distribution learned by different models, and a construction we propose called the interaction tensor. (An illustrative sketch of such a tensor appears after this table.) |
| Researcher Affiliation | Collaboration | Yiding Jiang (Carnegie Mellon University, yidingji@cs.cmu.edu); Christina Baek (Carnegie Mellon University, kbaek@cs.cmu.edu); J. Zico Kolter (Carnegie Mellon University and Bosch Center for AI, zkolter@cs.cmu.edu) |
| Pseudocode | Yes | Algorithm 1: Cluster Features (a generic clustering sketch follows this table) |
| Open Source Code | No | The paper does not provide an explicit statement about releasing source code or a direct link to a code repository for its methodology. |
| Open Datasets | Yes | We use a collection of M = 20 ResNet18 models (He et al., 2016a) trained on the CIFAR-10 dataset (Krizhevsky et al., 2009) |
| Dataset Splits | No | The paper mentions training and test sets but does not explicitly describe a validation set split for reproducibility. |
| Hardware Specification | Yes | All experiments in the paper are done on single Nvidia RTX 2080s and RTX A6000s. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers. |
| Experiment Setup | Yes | We train the 20 models with: initial learning rate: 0.1, weight decay: 0.0001, minibatch size: 100, data augmentation: No. (A minimal training-loop sketch follows this table.) |
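
The paper's central construction, the interaction tensor, is defined precisely in the paper itself. As a purely illustrative sketch of the shape of such an object, the snippet below builds a binary (models × examples × features) tensor from stacked per-model activations; the thresholding criterion, the array shapes, and the `interaction_tensor` helper are assumptions made here for illustration, not the paper's actual definition.

```python
import numpy as np

def interaction_tensor(activations, threshold=0.0):
    """Binarize per-model, per-example feature activations.

    `activations` has shape (M, N, F): M models, N examples, F features.
    Thresholding is a placeholder criterion for "model m uses feature f
    on example x"; the paper's construction differs in detail.
    """
    return (activations > threshold).astype(np.int8)

# Hypothetical usage with 20 models, N examples, F features:
# acts = np.stack([activations_for(model) for model in models])  # (20, N, F)
# T = interaction_tensor(acts)
# Feature agreement between two seeds as a Jaccard overlap:
# overlap = (T[0] & T[1]).sum() / max((T[0] | T[1]).sum(), 1)
```

Such a tensor makes the paper's comparison across random seeds concrete: slicing along the model axis gives each seed's feature usage pattern over the same data.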
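Algorithm 1 ("Cluster Features") is not reproduced in this card. The sketch below shows one generic way to cluster per-example features: k-means over penultimate-layer activations of a torchvision ResNet18. The hook location, the use of scikit-learn's `KMeans`, and the cluster count are all assumptions for illustration, not the paper's procedure.

```python
import numpy as np
import torch
from sklearn.cluster import KMeans

def extract_activations(model, loader, device="cpu"):
    """Collect penultimate-layer activations for every example in `loader`."""
    feats, cache = [], {}
    # Hook the global-average-pool output of a torchvision ResNet18;
    # treating these activations as "features" is an illustrative choice.
    handle = model.avgpool.register_forward_hook(
        lambda mod, inp, out: cache.update(z=out.flatten(1).detach().cpu())
    )
    model.eval().to(device)
    with torch.no_grad():
        for images, _ in loader:
            model(images.to(device))
            feats.append(cache["z"].numpy())
    handle.remove()
    return np.concatenate(feats)  # (num_examples, feature_dim)

# Hypothetical usage: cluster activations into k groups (k is a placeholder).
# acts = extract_activations(model, loader)
# labels = KMeans(n_clusters=8, n_init=10).fit_predict(acts)
```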
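The quoted hyperparameters (initial learning rate 0.1, weight decay 0.0001, minibatch size 100, no augmentation) are enough to sketch the training loop. A minimal PyTorch version follows; the optimizer choice (SGD with momentum 0.9), the epoch count, the normalization constants, and the use of torchvision's ImageNet-style ResNet18 in place of the paper's CIFAR variant are assumptions, as the quoted setup does not state them. Repeating this loop with 20 different random seeds would yield the M = 20 model collection described above.

```python
import torch
import torch.nn as nn
import torchvision
from torchvision import transforms
from torchvision.models import resnet18

# Quoted hyperparameters; EPOCHS and momentum are assumptions.
LR, WEIGHT_DECAY, BATCH_SIZE, EPOCHS = 0.1, 1e-4, 100, 50

# No data augmentation, per the quoted setup: tensor conversion + normalization only.
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2470, 0.2435, 0.2616)),
])
train_set = torchvision.datasets.CIFAR10("data", train=True, download=True,
                                         transform=transform)
loader = torch.utils.data.DataLoader(train_set, batch_size=BATCH_SIZE, shuffle=True)

device = "cuda" if torch.cuda.is_available() else "cpu"
model = resnet18(num_classes=10).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=LR, momentum=0.9,
                            weight_decay=WEIGHT_DECAY)
criterion = nn.CrossEntropyLoss()

model.train()
for epoch in range(EPOCHS):
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```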