Connect Later: Improving Fine-tuning for Robustness with Targeted Augmentations
Authors: Helen Qu, Sang Michael Xie
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our framework on 4 real-world datasets: wildlife identification (IWILDCAM-WILDS, Beery et al., 2020; Sagawa et al., 2022), tumor detection (CAMELYON17-WILDS, Bandi et al., 2018; Sagawa et al., 2022) and 2 astronomical time series tasks, ASTROCLASSIFICATION and REDSHIFTS, which we curate from The PLAsTiCC team et al. (2018). In Section 5, we show that Connect Later improves OOD performance over standard fine-tuning or supervised learning with targeted augmentations across all datasets. |
| Researcher Affiliation | Academia | Helen Qu (1), Sang Michael Xie (2). (1) Department of Physics and Astronomy, University of Pennsylvania; (2) Department of Computer Science, Stanford University. Correspondence to: Helen Qu <helenqu@sas.upenn.edu>. |
| Pseudocode | No | The paper describes methods using prose and mathematical equations but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain any explicit statements or links indicating the availability of open-source code for the described methodology. |
| Open Datasets | Yes | We evaluate our framework on 4 real-world datasets: wildlife identification (IWILDCAM-WILDS, Beery et al., 2020; Sagawa et al., 2022), tumor detection (CAMELYON17-WILDS, Bandi et al., 2018; Sagawa et al., 2022) and 2 astronomical time series tasks, ASTROCLASSIFICATION and REDSHIFTS, which we curate from The PLAsTiCC team et al. (2018). [Footnote 2] https://zenodo.org/record/2539456 |
| Dataset Splits | No | For IWILDCAM-WILDS, we use a ResNet-50 model pretrained on unlabeled ImageNet data with SwAV contrastive learning (Caron et al., 2020). ... train all models for 15 epochs with early stopping on OOD validation performance... However, no explicit details are given on how the validation set was created (e.g., its size or split percentage) for the main experiments across all datasets. The "80/10/10 train/validation/test split" mentioned in Appendix D refers to an auxiliary experiment for connectivity measures, not the main training splits. |
| Hardware Specification | No | The paper does not specify the hardware used for running experiments, such as GPU or CPU models. |
| Software Dependencies | No | The paper mentions software components such as the Informer model and the Adam optimizer but does not provide version numbers for the software dependencies (e.g., Python, PyTorch, CUDA) used in its experimental setup. |
| Experiment Setup | Yes | We perform pretraining with a batch size of 256 and learning rate 1e-4 (selected from the range 1e-3 to 1e-6) for 75,000 steps. We finetune the pretrained model with linear probing for 20,000 steps (for pretrained models only) and learning rate 1e-4, then fine-tuning for 10,000 steps at a learning rate of 4e-5. ... For IWILDCAM-WILDS, we train all models for 15 epochs with early stopping... We sample the following hyperparameters independently from the following distributions: the linear probe learning rate (10^Uniform[-3, -2]), fine-tuning learning rate (10^Uniform[-5, -2]), and probability of applying the augmentation (Uniform[0.5, 0.9]). A sketch of this sampling scheme appears below the table. |
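
The quoted setup samples learning rates log-uniformly and the augmentation probability uniformly. Below is a minimal sketch of that sampling scheme, assuming NumPy; the function and key names are illustrative and not taken from the authors' code.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

def sample_hyperparameters():
    """Sample one fine-tuning configuration as quoted in the Experiment
    Setup row. Names are hypothetical, not the authors' actual code."""
    return {
        # Linear probe learning rate: 10^Uniform[-3, -2]
        "lp_lr": 10 ** rng.uniform(-3, -2),
        # Fine-tuning learning rate: 10^Uniform[-5, -2]
        "ft_lr": 10 ** rng.uniform(-5, -2),
        # Probability of applying the targeted augmentation: Uniform[0.5, 0.9]
        "aug_prob": rng.uniform(0.5, 0.9),
    }

# Example: draw a few candidate configurations for a random search sweep.
for trial in range(3):
    print(sample_hyperparameters())
```

Sampling the exponent uniformly and exponentiating gives a log-uniform distribution, which is the standard way to search learning rates spanning several orders of magnitude.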