Evaluating Representations with Readout Model Switching
Authors: Yazhe Li, Jörg Bornschein, Marcus Hutter
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The proposed metric can be efficiently computed with an online method and we present results for pre-trained vision encoders of various architectures (ResNet and ViT) and objective functions (supervised and self-supervised) on a range of downstream tasks. We compare our methods with accuracy-based approaches and show that the latter are inconsistent when multiple readout models are used. |
| Researcher Affiliation | Industry | Yazhe Li yazhe@deepmind.com Jörg Bornschein bornschein@deepmind.com Marcus Hutter mhutter@deepmind.com |
| Pseudocode | Yes | Algorithm 1 MDL with Readout Model Switching; Algorithm 2 MDL with Readout Model Switching (2-Stage) (a sketch of the online switching computation follows the table) |
| Open Source Code | No | The paper does not contain an explicit statement about the release of its source code or a link to a code repository for the methodology described. |
| Open Datasets | Yes | For this purpose, we use the VTAB benchmark (Zhai et al., 2019) as downstream tasks. Then, we use ImageNet classification as downstream task to showcase the insights, such as data efficiency and preferred readout model, and for comparing in-domain and out-of-domain transfer, revealed by our evaluation method. |
| Dataset Splits | Yes | We split the dataset into training and validation where the validation set contains 10% of the data. (A split sketch follows the table.) |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments. |
| Software Dependencies | No | The paper mentions optimizers like AdamW and SGD, but does not specify version numbers for any software dependencies, libraries, or frameworks used (e.g., Python, PyTorch, TensorFlow, CUDA versions). |
| Experiment Setup | Yes | Table 4 (hyperparameters for accuracy-based evaluations on VTAB): Batch size 1024; Learning rate {1e-4, 3e-4, 1e-3, 3e-3}; Weight decay {0.0, 1e-6, 1e-4, 1e-2}; EMA step size (Polyak averaging) {1e-4, 1e-2, 1.0}. Table 6 (hyperparameters for computing MDL of frozen encoders on VTAB): Batch size 32; Number of replay-streams {3, 10, 30, 100}; Learning rate {1e-4, 3e-4, 1e-3, 3e-3}; AdamW β1 {0.5, 0.7, 0.9}; Weight decay {0.0, 1e-6, 1e-4, 1e-2}; EMA step size (Polyak averaging) {1e-4, 1e-2, 1.0}. (A sketch of enumerating these grids follows the table.) |
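The pseudocode entry above refers to the paper's Algorithm 1, which computes the description length online. The sketch below illustrates the general recipe in Python: prequential coding under a mixture of readout models with switching. The `readouts` interface (`predict_proba`, `update`), the data stream, and the fixed-share style switching rule are illustrative assumptions; the paper's Algorithm 1 defines its own switching strategy and readout set.

```python
import numpy as np

def mdl_with_readout_switching(readouts, data_stream, switch_prob=0.01):
    """Minimal sketch of online (prequential) MDL with readout model switching.

    `readouts` is a list of hypothetical readout objects, each exposing
    `predict_proba(x)` (class probabilities) and `update(x, y)` (one
    training step); these names are illustrative, not from the paper.
    """
    k = len(readouts)
    # Log posterior weights over readout models, initialised uniformly.
    log_w = np.full(k, -np.log(k))
    total_codelength = 0.0  # in nats

    for x, y in data_stream:
        # Predict: per-model log-probability of the true next label.
        log_p = np.array([np.log(r.predict_proba(x)[y] + 1e-12) for r in readouts])
        # Codelength of y under the switching mixture (log-sum-exp).
        log_mix = np.logaddexp.reduce(log_w + log_p)
        total_codelength -= log_mix
        # Bayes update of the model weights.
        log_w = log_w + log_p - log_mix
        # Fixed-share style switching: leak a little mass to every model,
        # letting the mixture move to more complex readouts as data grows.
        w = np.exp(log_w)
        w = (1.0 - switch_prob) * w + switch_prob / k
        log_w = np.log(w)
        # Update: train each readout on the revealed example.
        for r in readouts:
            r.update(x, y)

    return total_codelength
```

The key property is that each label is encoded before the readouts are trained on it, so the accumulated codelength penalises both poor final fit and slow learning.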
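The 90/10 train/validation split quoted in the Dataset Splits row is straightforward to reproduce. A minimal sketch, assuming a uniformly random permutation (the paper does not state how the 10% is drawn, so the shuffling and seed here are illustrative choices):

```python
import numpy as np

def train_val_split(n_examples, val_fraction=0.1, seed=0):
    """Hold out `val_fraction` of the data for validation; returns index arrays."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_examples)
    n_val = int(n_examples * val_fraction)
    return perm[n_val:], perm[:n_val]  # train indices, validation indices

train_idx, val_idx = train_val_split(50_000)
```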
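The grids in Tables 4 and 6 define Cartesian-product hyperparameter sweeps. Below is a sketch of enumerating the Table 6 sweep; the dictionary keys are illustrative names, not identifiers from the paper's code.

```python
from itertools import product

# Hypothetical reconstruction of the Table 6 grid (MDL of frozen
# encoders on VTAB), transcribed from the values quoted above.
mdl_grid = {
    "batch_size": [32],
    "num_replay_streams": [3, 10, 30, 100],
    "learning_rate": [1e-4, 3e-4, 1e-3, 3e-3],
    "adamw_beta1": [0.5, 0.7, 0.9],
    "weight_decay": [0.0, 1e-6, 1e-4, 1e-2],
    "ema_step_size": [1e-4, 1e-2, 1.0],
}

def iter_configs(grid):
    """Yield one config dict per point in the Cartesian product of the grid."""
    keys = list(grid)
    for values in product(*grid.values()):
        yield dict(zip(keys, values))

print(sum(1 for _ in iter_configs(mdl_grid)))  # 576 configurations
```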