Towards domain-invariant Self-Supervised Learning with Batch Styles Standardization
Authors: Marin Scalbert, Maria Vakalopoulou, Florent Couzinie-Devy
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on several UDG datasets demonstrate that the method significantly improves downstream task performance on unseen domains, often outperforming or rivaling UDG methods. |
| Researcher Affiliation | Collaboration | Marin Scalbert & Maria Vakalopoulou, MICS, CentraleSupélec, Université Paris-Saclay, Gif-sur-Yvette, France ({name.surname}@centralesupelec.fr); Florent Couzinié-Devy, VitaDX, Paris, France (f.couzinie-devy@vitadx.com) |
| Pseudocode | Yes | Algorithm 1 and Listing 1 provide pseudo-code and a PyTorch implementation of Batch Styles Standardization (a hedged sketch of the operation is given after the table). |
| Open Source Code | No | The full code will be released upon acceptance. |
| Open Datasets | Yes | To evaluate the extended SSL methods, experiments were conducted on 3 datasets commonly used for benchmarking DG/UDG methods, namely PACS, DomainNet, and Camelyon17 WILDS. |
| Dataset Splits | Yes | Camelyon17 WILDS (Koh et al., 2021) includes images covering 2 classes (tumor, no tumor) from 5 domains (hospitals). It is split into train, val, and test subsets comprising 3, 1, and 1 distinct domains, respectively (a hedged loading sketch using the WILDS package is given after the table). |
| Hardware Specification | Yes | This project was provided with computer and storage resources by GENCI at IDRIS thanks to the grant 2022-AD011013424R1 on the supercomputer Jean Zay's V100 partition. |
| Software Dependencies | No | The paper mentions using PyTorch in its pseudocode listing ('import torch') and specific optimizers like LARS and Adam, but does not provide version numbers for any software dependencies. |
| Experiment Setup | Yes | All other hyperparameters for SimCLR, SwAV, and MSN are specified in Tables 5, 6, and 7, respectively. For all datasets (PACS, DomainNet, and Camelyon17 WILDS), we use the Adam optimization method (Kingma & Ba, 2014) with an initial learning rate of 10⁻⁴, a learning rate scheduler with cosine decay, and a weight decay of 10⁻⁴. The networks are trained for 5K, 1K, and 15K steps with batch sizes of 128, 64, and 64, respectively (a minimal configuration sketch follows the table). |
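
The pseudocode row refers to the paper's Algorithm 1 and Listing 1, which are not reproduced here. As a rough illustration only, the snippet below sketches a Fourier-based batch styles standardization in PyTorch: the low-frequency amplitudes of every image's spectrum are replaced by those of one randomly chosen image in the batch, so all samples share a common style. The function name `batch_styles_standardization` and the window parameter `beta` are our own placeholders, and the details may differ from the authors' Listing 1.

```python
import torch

def batch_styles_standardization(x: torch.Tensor, beta: float = 0.1) -> torch.Tensor:
    """Hedged sketch: standardize the 'style' of a batch of images by replacing
    the low-frequency amplitudes of their Fourier spectra with those of a single
    reference image sampled from the batch, while keeping the original phases.

    x: batch of images with shape (B, C, H, W); `beta` sets the size of the
    low-frequency window whose amplitudes are swapped.
    """
    B, C, H, W = x.shape
    # 2D FFT per image, with the zero frequency shifted to the center.
    freq = torch.fft.fftshift(torch.fft.fft2(x, dim=(-2, -1)), dim=(-2, -1))
    amp, phase = freq.abs(), freq.angle()

    # Pick one image of the batch as the shared style reference.
    ref = torch.randint(0, B, (1,)).item()
    ref_amp = amp[ref : ref + 1]  # shape (1, C, H, W), broadcast over the batch

    # Replace the central low-frequency amplitudes with the reference ones.
    h, w = int(beta * H) // 2, int(beta * W) // 2
    ch, cw = H // 2, W // 2
    amp[:, :, ch - h : ch + h, cw - w : cw + w] = ref_amp[:, :, ch - h : ch + h, cw - w : cw + w]

    # Recombine standardized amplitudes with the original phases and invert the FFT.
    freq = torch.polar(amp, phase)
    out = torch.fft.ifft2(torch.fft.ifftshift(freq, dim=(-2, -1)), dim=(-2, -1)).real
    return out
```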
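
For the dataset splits row, the Camelyon17 WILDS train/val/test subsets (3, 1, and 1 hospitals) can typically be obtained through the `wilds` Python package; the snippet below is our own illustration, not code from the paper.

```python
# Hedged illustration (not from the paper): loading the Camelyon17 WILDS splits,
# where the train/val/test subsets cover 3/1/1 distinct hospitals.
from wilds import get_dataset
import torchvision.transforms as T

dataset = get_dataset(dataset="camelyon17", download=True)
transform = T.Compose([T.ToTensor()])

train_set = dataset.get_subset("train", transform=transform)  # 3 training hospitals
val_set = dataset.get_subset("val", transform=transform)      # 1 held-out hospital
test_set = dataset.get_subset("test", transform=transform)    # 1 held-out hospital
```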
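
The experiment setup row translates into a short optimization configuration. The sketch below is a hedged illustration of Adam with a 10⁻⁴ initial learning rate, cosine decay, and 10⁻⁴ weight decay; `model`, `total_steps`, and the elided forward/backward pass are placeholders rather than the authors' training loop.

```python
import torch

# Hedged sketch of the reported optimization setup: Adam, initial learning rate 1e-4,
# cosine learning-rate decay, weight decay 1e-4. `model` and `total_steps` are placeholders.
model = torch.nn.Linear(512, 2)  # placeholder network
total_steps = 15_000             # e.g. the Camelyon17 WILDS schedule

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4, weight_decay=1e-4)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=total_steps)

for step in range(total_steps):
    # ... forward pass, loss computation, and loss.backward() would go here ...
    optimizer.step()
    optimizer.zero_grad()
    scheduler.step()
```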