Navigating Data Heterogeneity in Federated Learning: A Semi-Supervised Federated Object Detection

Authors: Taehyeon Kim, Eric Lin, Junu Lee, Christian Lau, Vaikkunth Mugunthan

NeurIPS 2023

Each reproducibility variable is listed below with its result and the supporting LLM response.
Research Type: Experimental
  LLM response: "Extensive validation on prominent autonomous driving datasets (BDD100K, Cityscapes, and SODA10M) attests to the efficacy of our approach, demonstrating state-of-the-art results."
Researcher Affiliation: Collaboration
  LLM response: "Taehyeon Kim (KAIST), Eric Lin (DynamoFL), Junu Lee (The Wharton School), Christian Lau (DynamoFL), Vaikkunth Mugunthan (DynamoFL); contact: potter32@kaist.ac.kr"
Pseudocode: Yes
  LLM response: "Algorithm 1: FedSTO Algorithm within the SSFOD Framework" (a minimal sketch of this schedule follows below).
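The paper's Algorithm 1 is not reproduced in this report, so the following is a minimal Python sketch of the round schedule it names: warm-up, selective training (T1), and orthogonal enhancement (T2), aggregated server-side with FedAvg. The client methods (`warmup_step`, `selective_step`, `orthogonal_step`), the sample-count weighting, and the model interface are assumptions for illustration, not the authors' implementation.

```python
# Hedged sketch of the FedSTO round schedule (50 warm-up rounds, 100 rounds
# of selective pretraining T1, 150 rounds of orthogonal enhancement T2),
# with standard FedAvg aggregation. All client/model methods are hypothetical.

from copy import deepcopy

def fedavg(states, weights):
    """Weighted FedAvg over client state dicts (PyTorch-style mapping of tensors)."""
    total = sum(weights)
    avg = deepcopy(states[0])
    for key in avg:
        avg[key] = sum(w * s[key] for s, w in zip(states, weights)) / total
    return avg

def fedsto(global_model, clients, warmup=50, t1=100, t2=150):
    for rnd in range(warmup + t1 + t2):
        states, weights = [], []
        for client in clients:
            local = deepcopy(global_model)
            if rnd < warmup:
                client.warmup_step(local)        # warm-up phase (procedure per Algorithm 1; hypothetical here)
            elif rnd < warmup + t1:
                client.selective_step(local)     # T1: selective training, sketched as partial (backbone-only) updates
            else:
                client.orthogonal_step(local)    # T2: full update with an orthogonality regularizer
            states.append(local.state_dict())
            weights.append(client.num_samples)   # assumed sample-count weighting
        global_model.load_state_dict(fedavg(states, weights))
    return global_model
```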
Open Source Code: No
  LLM response: "The paper does not contain an unambiguous statement or a direct link indicating that the authors are releasing the source code for the methodology described in the paper. While it mentions YOLOv5, that is a third-party tool."
Open Datasets: Yes
  LLM response: "Extensive validation on prominent autonomous driving datasets (BDD100K, Cityscapes, and SODA10M) attests to the efficacy of our approach, demonstrating state-of-the-art results." (BDD100K [41], Cityscapes [4], SODA10M [9])
Dataset Splits: Yes
  LLM response: "For our studies, we employ the package, encompassing fine annotations for 3,475 images in the training and validation sets, and dummy annotations for the test set with 1,525 images." (For Cityscapes, the 3,475 fine-annotated images conventionally break down as 2,975 training and 500 validation images; see the partitioning sketch below.)
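As a worked illustration of these split sizes, here is a hedged Python sketch that deals the 2,975 fine-annotated Cityscapes training images into per-client shards for a federated setup. The client count, id format, and uniform random partitioning are assumptions; the paper studies heterogeneous (non-IID) clients, so its actual partitioning likely differs.

```python
# Hedged sketch: sharding the Cityscapes train split across federated clients
# for a semi-supervised setup. Client count and id naming are assumptions,
# not taken from the paper.

import random

def make_client_splits(image_ids, num_clients=4, seed=0):
    """Shuffle image ids with a fixed seed and deal them round-robin into shards."""
    rng = random.Random(seed)
    ids = list(image_ids)
    rng.shuffle(ids)
    return [ids[i::num_clients] for i in range(num_clients)]

# 2,975 Cityscapes train images (3,475 train+val minus the 500-image val set)
train_ids = [f"train_{i:05d}" for i in range(2975)]
client_shards = make_client_splits(train_ids, num_clients=4)
print([len(s) for s in client_shards])  # shard sizes differ by at most one image
```

Round-robin slicing after a seeded shuffle keeps the shards balanced and the split reproducible.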
Hardware Specification: No
  LLM response: "The paper does not provide specific hardware details (e.g., GPU models, CPU models, or cloud instance types) used for running its experiments."
Software Dependencies: No
  LLM response: "The paper mentions 'YOLOv5 Large model architecture' but does not specify version numbers for other key software components, libraries, or programming languages used."
Experiment Setup: Yes
  LLM response: "Our training regimen spans 300 rounds: 50 rounds of warm-up, 100 rounds of pretraining (T1), and 150 rounds of orthogonal enhancement (T2). We use the YOLOv5 Large model architecture with Mosaic, left-right flip, large-scale jittering, graying, Gaussian blur, cutout, and color space conversion augmentations. A constant learning rate of 0.01 was maintained. Binary sigmoid functions determined objectness and class probability, with a balance ratio of 0.3 for class, 0.7 for object, and an anchor threshold of 4.0. The ignore threshold ranged from 0.1 to 0.6, with a Non-Maximum Suppression (NMS) confidence threshold of 0.1 and an IoU threshold of 0.65. We incorporate an exponential moving average (EMA) rate of 0.999 for stable model parameter representation." (These values are collected into the configuration sketch below.)
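For reference, the quoted hyperparameters can be gathered into a single configuration sketch. The key names below are assumptions loosely modeled on YOLOv5's hyperparameter files; only the values are taken from the report.

```python
# Hedged sketch: the reported training hyperparameters in one place.
# Key names are assumptions; values are as quoted above.

HYPERPARAMS = {
    "rounds_warmup": 50,          # warm-up rounds
    "rounds_pretrain_t1": 100,    # pretraining rounds (T1)
    "rounds_enhance_t2": 150,     # orthogonal-enhancement rounds (T2)
    "lr": 0.01,                   # constant learning rate
    "cls": 0.3,                   # class-loss balance ratio
    "obj": 0.7,                   # objectness-loss balance ratio
    "anchor_t": 4.0,              # anchor threshold
    "ignore_thresh": (0.1, 0.6),  # ignore-threshold range
    "nms_conf_thresh": 0.1,       # NMS confidence threshold
    "nms_iou_thresh": 0.65,       # NMS IoU threshold
    "ema_decay": 0.999,           # EMA rate for model parameters
}

def ema_update(ema_param, param, decay=HYPERPARAMS["ema_decay"]):
    """Standard exponential moving average update for one model parameter."""
    return decay * ema_param + (1.0 - decay) * param
```

The EMA update is the textbook formulation implied by "an exponential moving average (EMA) rate of 0.999"; the paper does not spell out its exact update rule.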