Efficient Distributed Learning with Sparsity

Authors: Jialei Wang, Mladen Kolar, Nathan Srebro, Tong Zhang

ICML 2017

Reproducibility assessment — for each variable, the extracted result and the supporting LLM response:
Research Type: Experimental. In this section we present empirical comparisons between various approaches on both simulated and real-world datasets. We run the algorithms for both distributed regression and classification problems, and compare with the following algorithms: i) Local; ii) Centralize; iii) distributed proximal gradient descent (Prox GD); iv) Avg Debias (Lee et al., 2015b) with hard thresholding; and v) the proposed EDSL approach. (Sections 5.1, Simulations, and 5.2, Real-world Data Evaluation.)
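
For reference, the sketch below illustrates the two simplest baselines named in this response, Local and Centralize, for the regression case. It assumes a lasso (squared loss plus ℓ1 penalty) objective and uses scikit-learn's Lasso as the solver; the function names and the solver choice are illustrative assumptions, not the paper's implementation.

    # Hypothetical sketch of the "Local" and "Centralize" baselines for the
    # regression case; the lasso objective and scikit-learn solver are assumptions.
    import numpy as np
    from sklearn.linear_model import Lasso

    def local_baseline(data, lam):
        """Local: fit an l1-regularized model using only the first machine's data."""
        X1, y1 = data[0]
        return Lasso(alpha=lam).fit(X1, y1).coef_

    def centralize_baseline(data, lam):
        """Centralize: pool all machines' data and fit one l1-regularized model."""
        X = np.vstack([Xj for Xj, _ in data])
        y = np.concatenate([yj for _, yj in data])
        return Lasso(alpha=lam).fit(X, y).coef_
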
Researcher Affiliation: Collaboration. Jialei Wang (University of Chicago, USA), Mladen Kolar (University of Chicago, USA), Nathan Srebro (Toyota Technological Institute at Chicago, USA), Tong Zhang (Tencent AI Lab, China).
Pseudocode: Yes. Algorithm 1: Efficient Distributed Sparse Learning (EDSL).
Input: data {x_ji, y_ji}, j ∈ [m], i ∈ [n]; loss function ℓ(·, ·).
Initialization: the master obtains β̂⁰ by minimizing (3) and broadcasts β̂⁰ to every worker.
for t = 0, 1, . . . do
  Workers: for j = 2, 3, . . . , m do: if β̂^t is received from the master, calculate the gradient ∇L_j(β̂^t) and send it to the master.
  Master: if the gradients {∇L_j(β̂^t)}, j = 2, . . . , m, are received from all workers, obtain β̂^(t+1) by solving the shifted ℓ1-regularized problem in (4), and broadcast β̂^(t+1) to every worker.
end for
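
The shifted-ℓ1 loop above can be simulated in a single process. The sketch below assumes squared loss, uses a fixed regularization parameter λ in every round, and solves both the initial problem (3) and the shifted subproblem (4) with plain proximal gradient descent; the helper names, step-size choice, and inner solver are illustrative assumptions rather than the paper's implementation.

    # Single-process simulation of the EDSL loop in Algorithm 1, assuming squared
    # loss and a fixed l1 parameter; the inner solver (proximal gradient descent)
    # and all helper names are illustrative assumptions.
    import numpy as np

    def soft_threshold(v, tau):
        """Proximal operator of tau * ||.||_1."""
        return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

    def local_gradient(X, y, beta):
        """Gradient of the local squared loss (1/2n) * ||X beta - y||^2."""
        return X.T @ (X @ beta - y) / X.shape[0]

    def shifted_lasso(X, y, lam, shift=None, beta0=None, iters=500):
        """Minimize (1/2n)||X beta - y||^2 - <shift, beta> + lam * ||beta||_1."""
        n, p = X.shape
        beta = np.zeros(p) if beta0 is None else beta0.copy()
        shift = np.zeros(p) if shift is None else shift
        step = n / np.linalg.norm(X, 2) ** 2        # 1 / Lipschitz constant
        for _ in range(iters):
            grad = local_gradient(X, y, beta) - shift
            beta = soft_threshold(beta - step * grad, step * lam)
        return beta

    def edsl(data, lam, rounds=10):
        """data: list of (X_j, y_j), one pair per machine; machine 0 is the master."""
        X1, y1 = data[0]
        # Initialization: master minimizes its local l1-regularized loss, as in (3).
        beta = shifted_lasso(X1, y1, lam)
        for _ in range(rounds):
            # Workers (and master) evaluate local gradients at the current iterate.
            grads = [local_gradient(Xj, yj, beta) for Xj, yj in data]
            # Shift: master's local gradient minus the averaged global gradient,
            # so the shifted objective has the global gradient at the current iterate.
            shift = grads[0] - np.mean(grads, axis=0)
            # Master solves the shifted l1-regularized problem, as in (4).
            beta = shifted_lasso(X1, y1, lam, shift=shift, beta0=beta)
        return beta
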
Open Source Code: No. No statement regarding the public release or availability of source code for the described methodology was found.
Open Datasets: No. The paper mentions common datasets like "mnist", "connect4", "dna", and "mushrooms" for real-world data evaluation but does not provide specific access information (links, DOIs, or formal citations with authors and year) for these datasets.
Dataset Splits: Yes. For all data sets, we use 60% of the data for training, 20% as a held-out validation set for tuning the parameters, and the remaining 20% for testing.
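
A minimal sketch of the reported 60/20/20 split follows; the random shuffling, fixed seed, and function name are illustrative assumptions, since the paper does not describe how the split was drawn.

    # Illustrative 60%/20%/20% train/validation/test split; the random shuffle
    # and fixed seed are assumptions, not the paper's stated protocol.
    import numpy as np

    def split_60_20_20(X, y, seed=0):
        n = X.shape[0]
        idx = np.random.default_rng(seed).permutation(n)
        n_train, n_val = int(0.6 * n), int(0.2 * n)
        train = idx[:n_train]
        val = idx[n_train:n_train + n_val]
        test = idx[n_train + n_val:]
        return (X[train], y[train]), (X[val], y[val]), (X[test], y[test])
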
Hardware Specification: No. No specific hardware details (e.g., GPU/CPU models, memory specifications) used for running the experiments are described in the paper.
Software Dependencies: No. No specific software dependencies with version numbers are mentioned in the paper.
Experiment Setup: No. The paper reports problem settings (n, p, m, s), data-generation parameters (e.g., details of the covariance matrix), and data splits, but does not provide specific hyperparameter values (e.g., learning rate, batch size, epochs) or optimizer settings for the algorithms evaluated.
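
For concreteness, the sketch below generates synthetic distributed sparse-regression data for an (n, p, m, s) configuration of the kind referenced above. The Toeplitz covariance, unit coefficients, noise level, and default sizes are placeholder assumptions and do not reproduce the paper's configuration.

    # Hypothetical generator for a distributed sparse-regression simulation with
    # m machines, n samples per machine, dimension p, and sparsity s. The Toeplitz
    # covariance rho^|i-j|, unit coefficients, and noise level sigma are
    # placeholder assumptions, not the paper's stated configuration.
    import numpy as np

    def make_distributed_data(n=100, p=500, m=10, s=10, rho=0.5, sigma=1.0, seed=0):
        rng = np.random.default_rng(seed)
        cov = rho ** np.abs(np.subtract.outer(np.arange(p), np.arange(p)))
        beta_star = np.zeros(p)
        beta_star[:s] = 1.0                          # s non-zero coefficients
        data = []
        for _ in range(m):
            X = rng.multivariate_normal(np.zeros(p), cov, size=n)
            y = X @ beta_star + sigma * rng.standard_normal(n)
            data.append((X, y))
        return data, beta_star

    # Example usage with the hypothetical EDSL sketch above:
    #     data, beta_star = make_distributed_data()
    #     beta_hat = edsl(data, lam=0.1)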