Contrastive Alignment of Vision to Language Through Parameter-Efficient Transfer Learning
Authors: Zaid Khan, Yun Fu
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We describe a series of experiments: we show that existing knowledge is conserved more strongly in parameter-efficient training and that parameter-efficient scaling scales with model and dataset size. |
| Researcher Affiliation | Academia | Zaid Khan, Yun Fu Northeastern University, Boston, USA {khan.za, y.fu}@northeastern.edu |
| Pseudocode | No | The paper does not contain explicit pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code and weights at https://github.com/codezakh/LilT. |
| Open Datasets | Yes | Datasets: We draw 591,753 image-text pairs from the training set of COCO 2014 (Lin et al., 2014), following the split of Karpathy & Fei-Fei (2017). |
| Dataset Splits | Yes | Datasets: We draw 591,753 image-text pairs from the training set of COCO 2014 (Lin et al., 2014), following the split of Karpathy & Fei-Fei (2017). |
| Hardware Specification | Yes | We train each model with a batch size of 512 on 4x NVIDIA A6000 GPUs for 15 epochs, using the AdamW optimizer (Loshchilov & Hutter, 2017) with a weight decay of 0.02. |
| Software Dependencies | No | The paper mentions the 'AdamW optimizer' but does not specify version numbers for general software dependencies (e.g., Python, PyTorch, TensorFlow) or any specific libraries/packages beyond the optimizer. |
| Experiment Setup | Yes | We train each model with a batch size of 512 on 4x NVIDIA A6000 GPUs for 15 epochs, using the AdamW optimizer (Loshchilov & Hutter, 2017) with a weight decay of 0.02. The learning rate is warmed up to 1e-4 in the first 10 epochs, and then decayed to 1e-5. We use random crops of resolution 256×256 with RandAugment (Cubuk et al., 2020), with color transformations removed following Li et al. (2021a). |
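
Below is a minimal sketch of the training configuration quoted in the Experiment Setup row. The paper does not name a framework (see Software Dependencies), so PyTorch/torchvision are assumed here; the model is a placeholder and the exact shape of the warmup-then-decay schedule is an illustrative assumption.

```python
# Sketch of the reported setup: batch size 512, 15 epochs, AdamW with weight
# decay 0.02, LR warmed up to 1e-4 over 10 epochs then decayed to 1e-5,
# 256x256 random crops with RandAugment. Framework choice is assumed.
import torch
from torch.optim import AdamW
from torch.optim.lr_scheduler import LambdaLR
from torchvision import transforms

EPOCHS = 15
BATCH_SIZE = 512          # global batch size across 4x NVIDIA A6000 GPUs
WARMUP_EPOCHS = 10
PEAK_LR, FINAL_LR = 1e-4, 1e-5

# Random 256x256 crops with RandAugment. The paper removes color
# transformations from RandAugment (following Li et al., 2021a), which
# torchvision does not expose directly; an exact reproduction would need a
# custom augmentation policy.
train_transform = transforms.Compose([
    transforms.RandomResizedCrop(256),
    transforms.RandAugment(),
    transforms.ToTensor(),
])

model = torch.nn.Linear(256 * 256 * 3, 512)  # placeholder for the actual model
optimizer = AdamW(model.parameters(), lr=PEAK_LR, weight_decay=0.02)

def lr_lambda(epoch: int) -> float:
    """Warm up to the peak LR over the first 10 epochs, then decay toward 1e-5."""
    if epoch < WARMUP_EPOCHS:
        return (epoch + 1) / WARMUP_EPOCHS
    progress = (epoch - WARMUP_EPOCHS + 1) / (EPOCHS - WARMUP_EPOCHS)
    return 1.0 + progress * (FINAL_LR / PEAK_LR - 1.0)

# scheduler.step() is called once per epoch, so the LR reaches 1e-4 at
# epoch 10 and 1e-5 at the final epoch.
scheduler = LambdaLR(optimizer, lr_lambda=lr_lambda)
```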