Contrastive Learning Can Find An Optimal Basis For Approximately View-Invariant Functions
Authors: Daniel D. Johnson, Ayoub El Hanchi, Chris J. Maddison
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conclude by studying the behavior of contrastive learning models on two synthetic tasks for which the exact positive-pair kernel is known, and investigating the extent to which the basis of eigenfunctions can be extracted from trained models. As predicted by our theory, we find that the same eigenfunctions can be recovered from multiple model parameterizations and losses, although the accuracy depends on both the expressiveness of the kernel parameterization and the strength of the augmentations. |
| Researcher Affiliation | Collaboration | Daniel D. Johnson (University of Toronto, Google Research); Ayoub El Hanchi (University of Toronto); Chris J. Maddison (University of Toronto) |
| Pseudocode | No | The paper describes procedures and algorithms in prose but does not include any clearly labeled pseudocode blocks or algorithm listings. |
| Open Source Code | No | The paper cites external libraries such as JAX and Flax with their GitHub URLs, but it does not provide a statement of, or link to, the authors' own implementation of the described methodology. |
| Open Datasets | Yes | Our first dataset is a simple overlapping regions toy problem... our second dataset is derived from MNIST (LeCun et al., 2010), but with a carefully-chosen augmentation process so that computing K_+ is tractable. |
| Dataset Splits | Yes | For each, we fit a regularized linear predictor on 160 labeled training examples (16 augmented samples from each class), using 160 additional validation examples to tune the regularization strength. |
| Hardware Specification | No | The paper mentions implementation using JAX and FLAX libraries but does not specify any hardware details like GPU or CPU models, or cloud computing resources. |
| Software Dependencies | No | The paper mentions "Adam optimizer (Kingma & Ba, 2014)", "JAX and FLAX libraries (Bradbury et al., 2018; Heek et al., 2020)", scikit-learn, and NumPy. However, it does not specify version numbers for any of these software dependencies. |
| Experiment Setup | Yes | We train all methods for 12,000 steps using a batch size of 1024 independently-sampled positive pairs per iteration, using the Adam optimizer (Kingma & Ba, 2014) with a cosine-decay learning rate schedule. |
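
The evaluation quoted in the Research Type row compares trained contrastive models against the eigenfunctions of the exact positive-pair kernel, which is known on the synthetic tasks. Below is a minimal NumPy sketch of how such a reference eigenbasis can be computed from a kernel matrix; the matrix itself, its size, and the basis dimension are illustrative assumptions rather than values from the paper.

```python
# Hedged sketch, not the authors' code: eigendecomposing a known positive-pair
# kernel matrix to obtain the reference eigenfunction basis that learned
# representations can be compared against.
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 8                              # number of points / basis size (assumed)

# Placeholder positive-pair kernel matrix, symmetric PSD by construction.
A = rng.normal(size=(n, n))
K_plus = A @ A.T / n

eigvals, eigvecs = np.linalg.eigh(K_plus)  # eigh returns ascending eigenvalues
top_basis = eigvecs[:, ::-1][:, :d]        # top-d eigenvectors form the reference basis

# A learned representation matrix Z of shape (n, d) could then be scored by how
# well its column span aligns with top_basis, e.g. via principal angles.
```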
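
The Dataset Splits row describes the linear evaluation protocol: a regularized linear predictor fit on 160 labeled training examples, with 160 further validation examples used to pick the regularization strength. A minimal scikit-learn sketch of that loop follows; the random placeholder features, the 10-class assumption (16 samples per class), and the candidate regularization grid are all illustrative assumptions.

```python
# Hedged sketch of validation-based tuning of a regularized linear predictor.
# Features, labels, and the C grid are placeholders, not values from the paper.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_train, n_val, dim, n_classes = 160, 160, 32, 10   # 16 examples per class (10 classes assumed)

# Placeholder features standing in for learned contrastive representations.
X_train = rng.normal(size=(n_train, dim))
y_train = rng.integers(n_classes, size=n_train)
X_val = rng.normal(size=(n_val, dim))
y_val = rng.integers(n_classes, size=n_val)

best_acc, best_clf = -np.inf, None
for C in [1e-3, 1e-2, 1e-1, 1.0, 10.0]:             # inverse regularization strengths to try
    clf = LogisticRegression(C=C, max_iter=1000).fit(X_train, y_train)
    acc = clf.score(X_val, y_val)                   # validation accuracy drives the choice
    if acc > best_acc:
        best_acc, best_clf = acc, clf
```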
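
The Experiment Setup row reports 12,000 training steps with batches of 1024 positive pairs, the Adam optimizer, and a cosine-decay learning-rate schedule. A hedged sketch of that optimizer configuration using optax (which the table does not name) is below; the peak learning rate, parameter shapes, and loss function are placeholders rather than details from the paper.

```python
# Hedged sketch of the reported optimization setup: Adam with a cosine-decay
# schedule over 12,000 steps and batches of 1024. Peak LR, model, and loss are assumed.
import jax
import jax.numpy as jnp
import optax

num_steps, batch_size = 12_000, 1024
schedule = optax.cosine_decay_schedule(init_value=1e-3, decay_steps=num_steps)  # peak LR assumed
optimizer = optax.adam(learning_rate=schedule)

params = {"w": jnp.zeros((64, 32))}                 # placeholder parameters, not the paper's model
opt_state = optimizer.init(params)

def loss_fn(params, batch):
    # Placeholder quadratic loss standing in for a contrastive objective on positive pairs.
    return jnp.mean((batch @ params["w"]) ** 2)

@jax.jit
def train_step(params, opt_state, batch):
    grads = jax.grad(loss_fn)(params, batch)
    updates, opt_state = optimizer.update(grads, opt_state, params)
    return optax.apply_updates(params, updates), opt_state

# One illustrative step on a random batch of 1024 inputs with assumed dimension 64.
batch = jax.random.normal(jax.random.PRNGKey(0), (batch_size, 64))
params, opt_state = train_step(params, opt_state, batch)
```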