Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

FedWMSAM: Fast and Flat Federated Learning via Weighted Momentum and Sharpness-Aware Minimization

Authors: Tianle Li, Yongzhi Huang, Linshan Jiang, Chang Liu, Qipeng Xie, Wenfeng Du, Lu Wang, Kaishun Wu

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct extensive experiments on multiple datasets and model architectures, and the results validate the effectiveness, adaptability, and robustness of our method, demonstrating its superiority in addressing the optimization challenges of Federated Learning.
Researcher Affiliation | Academia | Tianle Li, College of Computer Science and Software Engineering, Shenzhen University, EMAIL; Yongzhi Huang, Data Science Analysis, The Hong Kong University of Science and Technology (Guangzhou), EMAIL; Linshan Jiang, National University of Singapore, EMAIL; Chang Liu, Nanyang Technological University, Singapore, EMAIL; Qipeng Xie, The Hong Kong University of Science and Technology (Guangzhou), EMAIL; Wenfeng Du, Shenzhen University, EMAIL; Lu Wang, Shenzhen University, EMAIL; Kaishun Wu, The Hong Kong University of Science and Technology (Guangzhou), EMAIL
Pseudocode | Yes | Algorithm 1 FedWMSAM
Open Source Code | Yes | Our code is available at https://github.com/Li-Tian-Le/NeurlPS_FedWMSAM.
Open Datasets | Yes | For Fashion-MNIST [41], we use a Multi-Layer Perceptron (MLP) architecture. For CIFAR-10 [42], we use ResNet-18 [43] as the backbone, ResNet-34 [43] for CIFAR-100 [42], and ResNet-50 [43] for OfficeHome.
Dataset Splits | Yes | By default, we set p_{k,c} ∼ Dir(β), where p_{k,c} denotes the class distribution of client k over class c, and β = 0.1. The main experiments are conducted with 100 clients, 10% participation per round, a batch size of 50, a local learning rate η_l = 0.1, a global learning rate η_g = 1, and five local epochs, running for 500 communication rounds. For Fashion-MNIST [41], we use a Multi-Layer Perceptron (MLP) architecture. For CIFAR-10 [42], we use ResNet-18 [43] as the backbone, ResNet-34 [43] for CIFAR-100 [42], and ResNet-50 [43] for OfficeHome. Each domain in OfficeHome is divided into one client with 10% data sample rate and 100% active ratio.
Hardware Specification | Yes | All experiments were implemented in PyTorch and conducted on a workstation with four NVIDIA GeForce RTX 3090 GPUs.
Software Dependencies | No | All experiments were implemented in PyTorch and conducted on a workstation with four NVIDIA GeForce RTX 3090 GPUs.
Experiment Setup | Yes | The main experiments are conducted with 100 clients, 10% participation per round, a batch size of 50, a local learning rate η_l = 0.1, a global learning rate η_g = 1, and five local epochs, running for 500 communication rounds.
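
To make the quoted split and setup concrete, the following is a minimal sketch of a Dirichlet label-skew partition and per-round client sampling. It is not taken from the authors' repository. It assumes the common class-wise variant, in which each class's samples are divided among clients using shares drawn from Dir(β); the paper's exact procedure may differ. The names `DEFAULTS`, `dirichlet_shares`, `dirichlet_partition`, and `sample_clients` are hypothetical, and only the numeric values in `DEFAULTS` come from the quoted text.

```python
import random
from itertools import accumulate

# Values quoted in the "Experiment Setup" row; the key names are illustrative.
DEFAULTS = {
    "num_clients": 100, "participation": 0.1, "batch_size": 50,
    "local_lr": 0.1, "global_lr": 1.0, "local_epochs": 5,
    "rounds": 500, "beta": 0.1,
}

def dirichlet_shares(alpha, size, rng):
    """Draw one symmetric Dirichlet(alpha) sample by normalizing gamma draws."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(size)]
    total = sum(draws)
    return [d / total for d in draws]

def dirichlet_partition(labels, num_clients=100, beta=0.1, seed=0):
    """Assign sample indices to clients with class-wise Dirichlet label skew
    (an assumed variant, not confirmed by the source)."""
    rng = random.Random(seed)
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    clients = [[] for _ in range(num_clients)]
    for c in sorted(by_class):
        idx = by_class[c]
        rng.shuffle(idx)
        shares = dirichlet_shares(beta, num_clients, rng)
        # Turn cumulative shares into cut points within this class's indices.
        bounds = [0] + [int(s * len(idx)) for s in accumulate(shares)]
        bounds[-1] = len(idx)  # guard against floating-point rounding
        for k in range(num_clients):
            clients[k].extend(idx[bounds[k]:bounds[k + 1]])
    return clients

def sample_clients(num_clients, participation, rng):
    """Pick the clients that take part in one communication round."""
    m = max(1, round(num_clients * participation))
    return rng.sample(range(num_clients), m)
```

With a small β such as 0.1, most of each class tends to land on a few clients, which is what produces the strongly non-IID setting. Some clients may receive very few samples or none under this variant, so a practical pipeline would typically resample or enforce a minimum client size.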