Does Learning from Decentralized Non-IID Unlabeled Data Benefit from Self Supervision?

Authors: Lirui Wang, Kaiqing Zhang, Yunzhu Li, Yonglong Tian, Russ Tedrake

ICLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments show that the decentralized SSL (Dec-SSL) approach is robust to the heterogeneity of decentralized datasets, and learns useful representation for object classification, detection, and segmentation tasks, even when combined with the simple and standard decentralized learning algorithm of Federated Averaging (FedAvg).
Researcher Affiliation | Academia | Lirui Wang, Kaiqing Zhang, Yunzhu Li, Yonglong Tian, Russ Tedrake (MIT CSAIL)
Pseudocode | Yes | We adopt the alignment regularization and clustering techniques, and developed a new Dec-SSL algorithm FeatARC, summarized in Algorithm 1 and Algorithm 2 in Appendix.
Open Source Code | Yes | Code is available at https://github.com/liruiw/Dec-SSL
Open Datasets | Yes | We study the effectiveness of a range of contrastive learning algorithms under a decentralized learning setting, on relatively large-scale datasets including ImageNet-100, MS-COCO, and a new real-world robotic warehouse dataset. Our experiments show that the decentralized SSL (Dec-SSL) approach is robust to the heterogeneity of decentralized datasets, and learns useful representation for object classification, detection, and segmentation tasks, even when combined with the simple and standard decentralized learning algorithm of Federated Averaging (FedAvg). CIFAR-10 (Krizhevsky et al., 2009) and MS-COCO (Lin et al., 2014).
Dataset Splits | Yes | ImageNet-100: 100 images per class for training, standard validation and test splits; MS-COCO: default training and validation splits; Amazon: 80% training, 20% testing (from Table 3 in Appendix C.1).
Hardware Specification | No | The paper states, "We thank MIT Supercloud for providing compute resources," but does not provide specific hardware details such as exact GPU or CPU models, processor types, or memory amounts used for running the experiments.
Software Dependencies | No | The paper mentions using PyTorch as the deep learning framework, but it does not specify the version of PyTorch or any other software dependencies needed to replicate the experiments.
Experiment Setup | Yes | For Mask R-CNN, we use 1x schedule, a batch size of 2, a learning rate of 0.02, and Adam optimizer with weight decay 0.0001, momentum 0.9, and gradient clipping 0.1. We train for 90k iterations. For linear probing, we train for 100 epochs, with batch size 256, initial learning rate of 0.03 for ImageNet-100 and 0.1 for CIFAR-10.
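
For context on the Research Type row above: Federated Averaging (FedAvg), the decentralized learning algorithm the paper combines with SSL, is the standard weighted average of client model parameters after local training. The sketch below is a minimal illustration under stated assumptions, not the authors' released implementation; the local_update callback and the use of client dataset sizes as averaging weights are assumptions. In the Dec-SSL setting, the local update would run a self-supervised objective (e.g., a contrastive loss) on each client's unlabeled, non-IID data.

# Minimal FedAvg sketch (illustrative only; local_update is a hypothetical callback).
import copy
import torch

def fedavg_round(global_model, client_loaders, local_update, device="cpu"):
    """One communication round: train locally on each client, then average
    client parameters weighted by local dataset size (assumed weighting)."""
    client_states, client_sizes = [], []
    for loader in client_loaders:
        local_model = copy.deepcopy(global_model).to(device)
        local_update(local_model, loader)  # e.g., a few local epochs of an SSL objective
        client_states.append(local_model.state_dict())
        client_sizes.append(len(loader.dataset))

    total = float(sum(client_sizes))
    avg_state = copy.deepcopy(client_states[0])
    for key in avg_state:
        weighted = sum(
            state[key].float() * (n / total)
            for state, n in zip(client_states, client_sizes)
        )
        avg_state[key] = weighted.to(avg_state[key].dtype)  # keep original dtypes (e.g., int buffers)
    global_model.load_state_dict(avg_state)
    return global_model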
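
The linear-probing hyperparameters quoted in the Experiment Setup row (100 epochs, batch size 256, initial learning rate 0.03 for ImageNet-100 and 0.1 for CIFAR-10) can be laid out as a small configuration sketch. Only those numbers come from the excerpt; the optimizer choice (SGD with momentum) and the cosine learning-rate schedule below are assumptions, not details stated in the paper excerpt.

# Hedged sketch of a linear-probing setup using the quoted hyperparameters.
import torch
import torch.nn as nn

def build_linear_probe(backbone, feat_dim, num_classes, dataset="imagenet100"):
    # Freeze the SSL backbone; only the linear classifier on top is trained.
    for p in backbone.parameters():
        p.requires_grad = False
    classifier = nn.Linear(feat_dim, num_classes)

    lr = 0.03 if dataset == "imagenet100" else 0.1  # 0.1 for CIFAR-10 per the quoted setup
    epochs, batch_size = 100, 256                   # from the quoted setup
    optimizer = torch.optim.SGD(classifier.parameters(), lr=lr, momentum=0.9)        # optimizer assumed
    scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=epochs)  # schedule assumed
    return classifier, optimizer, scheduler, epochs, batch_size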