Towards Stability and Generalization Bounds in Decentralized Minibatch Stochastic Gradient Descent

Authors: Jiahuan Wang, Hong Chen

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | In this paper, we address this theoretical gap by investigating the generalization properties of DM-SGD. We establish sharper generalization bounds for the DM-SGD algorithm with replacement (and without replacement) in the (non)convex and (non)smooth cases. Moreover, our results consistently recover the results of Centralized Stochastic Gradient Descent (C-SGD). (The standard stability-to-generalization relation underlying such bounds is sketched after the table.)
Researcher Affiliation | Academia | (1) College of Informatics, Huazhong Agricultural University, Wuhan, China; (2) Engineering Research Center of Intelligent Technology for Agriculture, Ministry of Education, Wuhan, China; (3) Hubei Engineering Technology Research Center of Agricultural Big Data, Wuhan, China. Contact: chenh@mail.hzau.edu.cn
Pseudocode | Yes | Algorithm 1: Decentralized Minibatch Stochastic Gradient Descent (DM-SGD). (An illustrative implementation sketch follows the table.)
Open Source Code | No | The paper does not explicitly state that open-source code is released for the described methodology, nor does it link to a code repository.
Open Datasets | No | The paper focuses on theoretical analysis and does not conduct experiments on specific public datasets. While it defines a 'training dataset' in the context of distributed learning, it provides no access information for any dataset used in empirical evaluation.
Dataset Splits | No | The paper is theoretical and includes no experimental evaluation with dataset splits. No split information (percentages, sample counts, or methodology) is provided.
Hardware Specification | No | The paper focuses on theoretical analysis and describes no experiments that would require specific hardware. No hardware specifications are provided.
Software Dependencies | No | The paper is theoretical and describes no empirical experiments that would require versioned software dependencies. No such details are provided.
Experiment Setup | No | The paper focuses on theoretical analysis and does not describe an experimental setup with specific hyperparameters or training configurations.
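For readers unfamiliar with the stability framework the paper builds on, the following is a minimal sketch of the standard relation between uniform stability and the expected generalization gap. This is general background from the algorithmic-stability literature, not the paper's DM-SGD-specific bounds:

```latex
% Background: uniform stability controls expected generalization.
% A randomized algorithm A is \epsilon-uniformly stable if, for any two
% samples S, S' differing in a single example and any point z,
%   \mathbb{E}_A\big[\ell(A(S); z) - \ell(A(S'); z)\big] \le \epsilon.
% Uniform stability then bounds the expected generalization gap:
\left| \mathbb{E}_{S,A}\big[ R(A(S)) - R_S(A(S)) \big] \right| \;\le\; \epsilon,
% where R is the population risk and R_S the empirical risk on sample S.
```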
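The paper's Algorithm 1 is not reproduced here, so the following is a hypothetical sketch rather than the authors' pseudocode: a minimal NumPy implementation of decentralized minibatch SGD in which each node takes a local minibatch gradient step (sampling with replacement) and then gossip-averages its parameters with neighbors through a doubly stochastic mixing matrix W. All names (dm_sgd, W, eta, batch_size) are illustrative assumptions.

```python
import numpy as np

def dm_sgd(data, labels, W, T=200, eta=0.05, batch_size=8, seed=0):
    """Hypothetical DM-SGD sketch for least-squares regression.

    data[i], labels[i] hold node i's local shard; W is an m x m doubly
    stochastic mixing matrix encoding the communication graph."""
    rng = np.random.default_rng(seed)
    m = W.shape[0]                      # number of nodes
    d = data[0].shape[1]                # feature dimension
    x = np.zeros((m, d))                # one parameter vector per node
    for _ in range(T):
        grads = np.empty((m, d))
        for i in range(m):
            # minibatch sampled with replacement from node i's shard
            idx = rng.integers(0, data[i].shape[0], size=batch_size)
            A, b = data[i][idx], labels[i][idx]
            grads[i] = A.T @ (A @ x[i] - b) / batch_size
        # local gradient step, then one round of gossip averaging
        x = W @ (x - eta * grads)
    return x.mean(axis=0)               # consensus estimate

# Toy usage: 4 nodes on a ring, synthetic linear data.
m, n_i, d = 4, 50, 3
rng = np.random.default_rng(1)
w_true = rng.normal(size=d)
data = [rng.normal(size=(n_i, d)) for _ in range(m)]
labels = [A @ w_true + 0.01 * rng.normal(size=n_i) for A in data]
W = np.array([[0.50, 0.25, 0.00, 0.25],
              [0.25, 0.50, 0.25, 0.00],
              [0.00, 0.25, 0.50, 0.25],
              [0.25, 0.00, 0.25, 0.50]])  # doubly stochastic ring
w_hat = dm_sgd(data, labels, W)
print(np.linalg.norm(w_hat - w_true))    # should be small
```

Setting W to the all-ones matrix divided by the node count makes every round a full average, which is one way to see how decentralized results can recover the behavior of centralized minibatch SGD (C-SGD), as the abstract's claim suggests.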