Asynchronous Decentralized Online Learning

Authors: Jiyan Jiang, Wenpeng Zhang, Jinjie Gu, Wenwu Zhu

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Extensive experiments show that AD-OGP runs significantly faster than its synchronous counterpart and also verify the theoretical results."
Researcher Affiliation | Collaboration | Jiyan Jiang (Tsinghua University, scjjy95@outlook.com); Wenpeng Zhang (Ant Group, zhangwenpeng0@gmail.com); Jinjie Gu (Ant Group, jinjie.gujj@antgroup.com); Wenwu Zhu (Tsinghua University, wwzhu@tsinghua.edu.cn)
Pseudocode | Yes | "Protocol 1: Asynchronous Decentralized Online Convex Optimization (AD-OCO) ... Algorithm 1: Asynchronous Decentralized Online Gradient-Push (AD-OGP)" (see the illustrative gradient-push sketch after this table)
Open Source Code | No | The paper neither states unambiguously that the authors are releasing code for the described work nor provides a direct link to a source-code repository.
Open Datasets | Yes | "We select two large-scale real-world datasets. (i) The higgs dataset is a benchmark dataset in high-energy physics [4] for binary classification, which consists of 11 million instances with 28 features. (ii) The poker-hand dataset is a commonly used dataset in automatic rule generation [6, 7] for 10-class classification, which has 1 million instances with 25 features." (a hypothetical loading sketch follows this table)
Dataset Splits | No | The paper uses the higgs and poker-hand datasets but specifies neither split percentages nor sample counts for training, validation, or test sets, and cites no predefined splits.
Hardware Specification | No | The paper does not report the hardware (exact GPU/CPU models, processor speeds, or memory amounts) used to run its experiments.
Software Dependencies | No | The paper does not list the ancillary software (library or solver names with version numbers) needed to replicate the experiments.
Experiment Setup | Yes | "For binary classification, we use the logistic loss; for multi-class classification, we use the multivariate logistic loss [11] (see detailed definitions in supplementary materials). ... Moreover, we adopt the commonly used L2-norm balls as decision sets and set their diameters as 100. The learning rate η is set as what the corresponding theory suggests (see more details in supplementary materials)." (a sketch of these losses and the ball projection follows below)
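
On the Pseudocode row: the asynchronous, delay-tolerant details of AD-OGP are specific to the paper, but the underlying primitive, gradient-push (push-sum averaging plus a local online gradient step over a column-stochastic mixing matrix), is standard. Below is a minimal synchronous sketch of that primitive, not the authors' algorithm: the round structure, the `subgrad` callback, and the ring matrix in the usage snippet are all illustrative assumptions.

```python
import numpy as np

def gradient_push_round(X, w, P, subgrad, eta):
    """One synchronous round of online gradient-push (push-sum + gradient step).

    X       : (n, d) array; row i is node i's primal variable x_i.
    w       : (n,) array of push-sum weights (initialized to 1).
    P       : (n, n) column-stochastic mixing matrix; P[i, j] is the
              fraction of node j's mass pushed to node i.
    subgrad : callable subgrad(i, z) -> subgradient of node i's current
              online loss at the de-biased iterate z.
    eta     : learning rate.
    """
    X_mix = P @ X                  # x_i <- sum_j P[i, j] * x_j
    w_mix = P @ w                  # w_i <- sum_j P[i, j] * w_j
    Z = X_mix / w_mix[:, None]     # de-biased estimates z_i = x_i / w_i
    for i in range(X.shape[0]):    # local online (sub)gradient steps
        X_mix[i] -= eta * subgrad(i, Z[i])
    return X_mix, w_mix, Z

# Toy usage on a 4-node directed ring: each node keeps half its mass
# and pushes the other half to its successor (columns of P sum to 1).
n, d = 4, 3
P = 0.5 * np.eye(n) + 0.5 * np.roll(np.eye(n), 1, axis=0)
X, w = np.zeros((n, d)), np.ones(n)
targets = np.random.randn(n, d)        # stand-in local data
subgrad = lambda i, z: z - targets[i]  # grad of 0.5 * ||z - targets[i]||^2
for t in range(100):
    X, w, Z = gradient_push_round(X, w, P, subgrad, eta=0.1)
```

In the constrained setting the paper studies, a Euclidean projection onto the decision set (an L2 ball here; see the last sketch below) would follow each gradient step.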
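On the Open Datasets row: both datasets are distributed via the UCI Machine Learning Repository ("HIGGS" and "Poker Hand"). The sketch below shows one hypothetical way to load them with pandas; the file paths mirror the standard UCI download names but are assumptions, and the encoding that turns poker-hand's 10 raw card attributes into the 25 features the paper reports is not specified in the main text.

```python
import pandas as pd

# Hypothetical local paths, following the standard UCI file names.
HIGGS_PATH = "HIGGS.csv.gz"          # ~11M rows: label + 28 features
POKER_TRAIN = "poker-hand-training-true.data"
POKER_TEST = "poker-hand-testing.data"

# HIGGS: first column is the binary label, remaining 28 are features.
higgs = pd.read_csv(HIGGS_PATH, header=None, compression="gzip")
y_higgs, X_higgs = higgs.iloc[:, 0].astype(int), higgs.iloc[:, 1:]

# Poker Hand: 10 raw card attributes plus a 10-class label in the last
# column (the 25 features reported in the paper presumably come from an
# encoding step described only in its supplementary materials).
poker = pd.concat(
    [pd.read_csv(p, header=None) for p in (POKER_TRAIN, POKER_TEST)],
    ignore_index=True,
)
y_poker, X_poker = poker.iloc[:, -1], poker.iloc[:, :-1]
```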
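On the Experiment Setup row: the paper defers exact definitions to its supplementary materials, but under the standard readings of "logistic loss" and "multivariate logistic loss" (softmax cross-entropy), and with Euclidean projection onto an L2-norm ball of diameter 100, a minimal sketch looks like this (all function names are illustrative, not the paper's):

```python
import numpy as np

def logistic_loss(x, a, y):
    """Binary logistic loss for one instance (a, y) with label y in {-1, +1}."""
    return np.log1p(np.exp(-y * a.dot(x)))

def multivariate_logistic_loss(W, a, y):
    """Softmax (multivariate) logistic loss; W is (num_classes, d), y a class index."""
    scores = W @ a
    return np.logaddexp.reduce(scores) - scores[y]

def project_l2_ball(x, diameter=100.0):
    """Euclidean projection onto an L2-norm ball of the given diameter
    (radius = diameter / 2), matching the paper's decision sets."""
    radius = diameter / 2.0
    norm = np.linalg.norm(x)
    return x if norm <= radius else x * (radius / norm)
```

The learning rate is set "as what the corresponding theory suggests"; the exact schedule lives in the paper's supplementary materials and is not reproduced here.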