Improving neural network representations using human similarity judgments
Authors: Lukas Muttenthaler, Lorenz Linhardt, Jonas Dippel, Robert A. Vandermeulen, Katherine Hermann, Andrew Lampinen, Simon Kornblith
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We find that a naive approach leads to large changes in local representational structure that harm downstream performance. Thus, we propose a novel method that aligns the global structure of representations while preserving their local structure. This global-local transform considerably improves accuracy across a variety of few-shot learning and anomaly detection tasks. |
| Researcher Affiliation | Collaboration | Lukas Muttenthaler (Google DeepMind; Machine Learning Group, Technische Universität Berlin; BIFOLD, Berlin, Germany); Lorenz Linhardt (Machine Learning Group, Technische Universität Berlin; BIFOLD, Berlin, Germany); Jonas Dippel (Machine Learning Group, Technische Universität Berlin; BIFOLD, Berlin, Germany); Robert A. Vandermeulen (Machine Learning Group, Technische Universität Berlin; BIFOLD, Berlin, Germany); Katherine Hermann (Google DeepMind, Mountain View, CA, USA); Andrew K. Lampinen (Google DeepMind, London, UK); Simon Kornblith (Google DeepMind, Toronto, Canada) |
| Pseudocode | No | The paper describes methods using mathematical equations (e.g., Eq. 1, 3, 4, 5) but does not present any structured pseudocode or algorithm blocks. |
| Open Source Code | No | For extracting the model features, we use the Python library thingsvision [59]. |
| Open Datasets | Yes | Data. For measuring the degree of alignment between human and neural network similarity spaces, we use the THINGS dataset, which is a large behavioral dataset of 4.70 million unique triplet responses crowdsourced from 12,340 human participants for 1854 natural object images [32]. |
| Dataset Splits | Yes | We determine λ via grid-search using k-fold cross-validation (CV). |
| Hardware Specification | Yes | We used a compute time of approximately 400 hours on a single Nvidia A100 GPU with 40GB VRAM for all linear probing experiments including the hyperparameter sweep. |
| Software Dependencies | No | We use PyTorch [63] for implementing the probes and PyTorch Lightning to accelerate training. |
| Experiment Setup | Yes | Specifically, we perform an extensive grid search over the Cartesian product of the following sets of hyperparameters: η ∈ {0.0001, 0.001, 0.01, 0.1}, λ ∈ {0.01, 0.1, 1.0, 10.0}, α ∈ {0.05, 0.1, 0.25, 0.5, 1.0}, τ ∈ {0.1, 0.25, 0.5, 1.0}. |
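The experiment-setup row describes an exhaustive search over the Cartesian product of four hyperparameter sets, with λ (and the other values) selected by k-fold cross-validation. A minimal sketch of that sweep is below; the `score_fn` argument is a hypothetical placeholder for the paper's cross-validated probe accuracy, not the authors' actual objective.

```python
import itertools

# Hyperparameter sets quoted in the Experiment Setup row.
GRID = {
    "eta":   [0.0001, 0.001, 0.01, 0.1],  # learning rate η
    "lam":   [0.01, 0.1, 1.0, 10.0],      # regularization strength λ
    "alpha": [0.05, 0.1, 0.25, 0.5, 1.0], # α
    "tau":   [0.1, 0.25, 0.5, 1.0],       # τ
}

def grid_search(score_fn, grid=GRID):
    """Score every combination in the grid and return the best config.

    `score_fn` maps a config dict to a float (higher is better); in the
    paper this would be k-fold cross-validated performance.
    """
    keys = list(grid)
    best_cfg, best_score = None, float("-inf")
    for values in itertools.product(*(grid[k] for k in keys)):
        cfg = dict(zip(keys, values))
        score = score_fn(cfg)
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score

# Toy scoring function for illustration only (not the paper's objective):
cfg, score = grid_search(lambda c: -(c["lam"] - 0.1) ** 2 - (c["eta"] - 0.01) ** 2)
```

The full grid covers 4 × 4 × 5 × 4 = 320 configurations, which is consistent with the substantial compute budget reported in the hardware row.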