Efficient Distributed Learning with Sparsity

Authors: Jialei Wang, Mladen Kolar, Nathan Srebro, Tong Zhang

ICML 2017

Reproducibility assessment — for each variable, the extracted result and the supporting LLM response:
Research Type: Experimental. In this section we present empirical comparisons between various approaches on both simulated and real-world datasets. We run the algorithms for both distributed regression and classification problems, and compare with the following algorithms: i) Local; ii) Centralize; iii) distributed proximal gradient descent (Prox GD); iv) Avg Debias (Lee et al., 2015b) with hard thresholding; and v) the proposed EDSL approach. (Sections 5.1, Simulations, and 5.2, Real-world Data Evaluation.)
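
For reference, the sketch below illustrates the two simplest baselines named in this response, Local and Centralize, for the regression case. It assumes a lasso (squared loss plus ℓ1 penalty) objective and uses scikit-learn's Lasso as the solver; the function names and the solver choice are illustrative assumptions, not the paper's implementation.

    # Hypothetical sketch of the "Local" and "Centralize" baselines for the
    # regression case; the lasso objective and scikit-learn solver are assumptions.
    import numpy as np
    from sklearn.linear_model import Lasso

    def local_baseline(data, lam):
        """Local: fit an l1-regularized model using only the first machine's data."""
        X1, y1 = data[0]
        return Lasso(alpha=lam).fit(X1, y1).coef_

    def centralize_baseline(data, lam):
        """Centralize: pool all machines' data and fit one l1-regularized model."""
        X = np.vstack([Xj for Xj, _ in data])
        y = np.concatenate([yj for _, yj in data])
        return Lasso(alpha=lam).fit(X, y).coef_
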
Researcher Affiliation: Collaboration. Jialei Wang (University of Chicago, USA), Mladen Kolar (University of Chicago, USA), Nathan Srebro (Toyota Technological Institute at Chicago, USA), Tong Zhang (Tencent AI Lab, China).
Pseudocode: Yes. Algorithm 1: Efficient Distributed Sparse Learning (EDSL).
Input: data {x_ji, y_ji}, j ∈ [m], i ∈ [n]; loss function ℓ(·, ·).
Initialization: the master obtains β̂⁰ by minimizing (3) and broadcasts β̂⁰ to every worker.
for t = 0, 1, . . . do
  Workers: for j = 2, 3, . . . , m do: if β̂^t is received from the master, calculate the gradient ∇L_j(β̂^t) and send it to the master.
  Master: if the gradients {∇L_j(β̂^t)}, j = 2, . . . , m, are received from all workers, obtain β̂^(t+1) by solving the shifted ℓ1-regularized problem in (4), and broadcast β̂^(t+1) to every worker.
end for
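
The shifted-ℓ1 loop above can be simulated in a single process. The sketch below assumes squared loss, uses a fixed regularization parameter λ in every round, and solves both the initial problem (3) and the shifted subproblem (4) with plain proximal gradient descent; the helper names, step-size choice, and inner solver are illustrative assumptions rather than the paper's implementation.

    # Single-process simulation of the EDSL loop in Algorithm 1, assuming squared
    # loss and a fixed l1 parameter; the inner solver (proximal gradient descent)
    # and all helper names are illustrative assumptions.
    import numpy as np

    def soft_threshold(v, tau):
        """Proximal operator of tau * ||.||_1."""
        return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

    def local_gradient(X, y, beta):
        """Gradient of the local squared loss (1/2n) * ||X beta - y||^2."""
        return X.T @ (X @ beta - y) / X.shape[0]

    def shifted_lasso(X, y, lam, shift=None, beta0=None, iters=500):
        """Minimize (1/2n)||X beta - y||^2 - <shift, beta> + lam * ||beta||_1."""
        n, p = X.shape
        beta = np.zeros(p) if beta0 is None else beta0.copy()
        shift = np.zeros(p) if shift is None else shift
        step = n / np.linalg.norm(X, 2) ** 2        # 1 / Lipschitz constant
        for _ in range(iters):
            grad = local_gradient(X, y, beta) - shift
            beta = soft_threshold(beta - step * grad, step * lam)
        return beta

    def edsl(data, lam, rounds=10):
        """data: list of (X_j, y_j), one pair per machine; machine 0 is the master."""
        X1, y1 = data[0]
        # Initialization: master minimizes its local l1-regularized loss, as in (3).
        beta = shifted_lasso(X1, y1, lam)
        for _ in range(rounds):
            # Workers (and master) evaluate local gradients at the current iterate.
            grads = [local_gradient(Xj, yj, beta) for Xj, yj in data]
            # Shift: master's local gradient minus the averaged global gradient,
            # so the shifted objective has the global gradient at the current iterate.
            shift = grads[0] - np.mean(grads, axis=0)
            # Master solves the shifted l1-regularized problem, as in (4).
            beta = shifted_lasso(X1, y1, lam, shift=shift, beta0=beta)
        return beta
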
Open Source Code: No. No statement regarding the public release or availability of source code for the described methodology was found.
Open Datasets: No. The paper mentions common datasets like "mnist", "connect4", "dna", and "mushrooms" for real-world data evaluation but does not provide specific access information (links, DOIs, or formal citations with authors and year) for these datasets.
Dataset Splits: Yes. For all data sets, we use 60% of the data for training, 20% as a held-out validation set for tuning the parameters, and the remaining 20% for testing.
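
A minimal sketch of the reported 60/20/20 split follows; the random shuffling, fixed seed, and function name are illustrative assumptions, since the paper does not describe how the split was drawn.

    # Illustrative 60%/20%/20% train/validation/test split; the random shuffle
    # and fixed seed are assumptions, not the paper's stated protocol.
    import numpy as np

    def split_60_20_20(X, y, seed=0):
        n = X.shape[0]
        idx = np.random.default_rng(seed).permutation(n)
        n_train, n_val = int(0.6 * n), int(0.2 * n)
        train = idx[:n_train]
        val = idx[n_train:n_train + n_val]
        test = idx[n_train + n_val:]
        return (X[train], y[train]), (X[val], y[val]), (X[test], y[test])
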
Hardware Specification: No. No specific hardware details (e.g., GPU/CPU models, memory specifications) used for running the experiments are described in the paper.
Software Dependencies: No. No specific software dependencies with version numbers are mentioned in the paper.
Experiment Setup: No. The paper reports problem settings (n, p, m, s), data-generation parameters (e.g., details of the covariance matrix), and data splits, but does not provide specific hyperparameter values (e.g., learning rate, batch size, epochs) or optimizer settings for the algorithms evaluated.
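
For concreteness, the sketch below generates synthetic distributed sparse-regression data for an (n, p, m, s) configuration of the kind referenced above. The Toeplitz covariance, unit coefficients, noise level, and default sizes are placeholder assumptions and do not reproduce the paper's configuration.

    # Hypothetical generator for a distributed sparse-regression simulation with
    # m machines, n samples per machine, dimension p, and sparsity s. The Toeplitz
    # covariance rho^|i-j|, unit coefficients, and noise level sigma are
    # placeholder assumptions, not the paper's stated configuration.
    import numpy as np

    def make_distributed_data(n=100, p=500, m=10, s=10, rho=0.5, sigma=1.0, seed=0):
        rng = np.random.default_rng(seed)
        cov = rho ** np.abs(np.subtract.outer(np.arange(p), np.arange(p)))
        beta_star = np.zeros(p)
        beta_star[:s] = 1.0                          # s non-zero coefficients
        data = []
        for _ in range(m):
            X = rng.multivariate_normal(np.zeros(p), cov, size=n)
            y = X @ beta_star + sigma * rng.standard_normal(n)
            data.append((X, y))
        return data, beta_star

    # Example usage with the hypothetical EDSL sketch above:
    #     data, beta_star = make_distributed_data()
    #     beta_hat = edsl(data, lam=0.1)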