On Model Parallelization and Scheduling Strategies for Distributed Machine Learning
Authors: Seunghak Lee, Jin Kyu Kim, Xun Zheng, Qirong Ho, Garth A Gibson, Eric P Xing
NeurIPS 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the efficacy of model-parallel algorithms implemented on STRADS versus popular implementations for topic modeling, matrix factorization, and Lasso. We conducted experiments on two clusters... |
| Researcher Affiliation | Academia | School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213 (seunghak@, jinkyuk@, xunzheng@, garth@, epxing@cs.cmu.edu); Institute for Infocomm Research, A*STAR, Singapore 138632 (hoqirong@gmail.com) |
| Pseudocode | Yes | Figure 2: STRADS interface: Basic functional signatures of schedule, push, pull, using pseudocode. Figure 3: STRADS LDA pseudocode. Figure 5: STRADS MF pseudocode. Figure 6: STRADS Lasso pseudocode. |
| Open Source Code | No | The paper does not provide a statement or link indicating that its own source code is open or publicly available. It mentions using third-party tools like Open MPI. |
| Open Datasets | Yes | We used 3.9M English Wikipedia abstracts, and conducted experiments using both unigram (1-word) tokens (V = 2.5M unique unigrams, 179M tokens) and bigram (2-word) tokens [16] (V = 21.8M unique bigrams, 79M tokens). ... We used the Netflix dataset [2] for our MF experiments: 100M anonymized ratings from 480,189 users on 17,770 movies. |
| Dataset Splits | No | The paper describes the datasets used and the experimental setup for convergence and scalability, but it does not specify explicit training, validation, or test dataset splits (e.g., percentages or exact counts for each split). |
| Hardware Specification | Yes | The 2-core cluster contains 128 machines, each with two 2.6GHz AMD cores and 8GB RAM, and connected via a 1Gbps network interface. The 16-core cluster contains 9 machines, each with 16 2.1GHz AMD cores and 64GB RAM, and connected via a 40Gbps network interface. |
| Software Dependencies | Yes | We implemented STRADS using C++ and the Boost libraries, and Open MPI 1.4.5 was used for asynchronous communication between the master schedulers, workers, and key-value stores. |
| Experiment Setup | Yes | for Lasso, we set λ = 0.001, and for MF, we set λ = 0.05. ... We set the number of topics to K = 5000 and 10000 (also larger than recent literature [1]). ... We varied the rank of W, H from K = 20 to 2000, which exceeds the upper limit of previous MF papers [26, 10, 24]. |
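
Because no source code is released, the schedule, push, and pull primitives named in the Pseudocode row can be hard to picture. The following is a minimal single-process sketch of how those three steps might fit together for Lasso coordinate descent, using the λ = 0.001 setting quoted above. It is not the authors' implementation: the helper names (`soft_threshold`, `sync`, `run`), the random scheduler, the in-memory "shards", and the one-coordinate-at-a-time update are all assumptions made for illustration, whereas STRADS itself is a C++/MPI system.

```python
import random

def soft_threshold(z, lam):
    """Soft-thresholding operator used by Lasso coordinate descent."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def schedule(num_features, batch_size, rng):
    """Pick which coefficients to update this round (assumption: uniform random)."""
    return rng.sample(range(num_features), batch_size)

def push(shard, j):
    """Worker step: partial sums for coefficient j over this worker's rows."""
    rows, residual = shard
    num = sum(row[j] * r for row, r in zip(rows, residual))
    den = sum(row[j] * row[j] for row in rows)
    return num, den

def pull(partials, beta_j, lam):
    """Aggregation step: combine worker partials into the new value of beta_j."""
    num = sum(p[0] for p in partials)
    den = sum(p[1] for p in partials)
    if den == 0.0:
        return 0.0
    return soft_threshold(num + den * beta_j, lam) / den

def sync(shard, j, delta):
    """Hypothetical helper: refresh a worker's residuals after beta_j changes."""
    rows, residual = shard
    for i, row in enumerate(rows):
        residual[i] -= row[j] * delta

def lasso_objective(shards, beta, lam):
    """0.5 * squared residual norm plus lam * L1 norm of beta."""
    loss = sum(r * r for _, residual in shards for r in residual)
    return 0.5 * loss + lam * sum(abs(b) for b in beta)

def run(shards, num_features, lam=0.001, rounds=100, batch_size=2, seed=0):
    """Each shard is (rows, residual); residual must start as a copy of y (beta = 0)."""
    rng = random.Random(seed)
    beta = [0.0] * num_features
    for _ in range(rounds):
        for j in schedule(num_features, batch_size, rng):
            partials = [push(shard, j) for shard in shards]
            new_value = pull(partials, beta[j], lam)
            delta = new_value - beta[j]
            beta[j] = new_value
            for shard in shards:
                sync(shard, j, delta)
    return beta

if __name__ == "__main__":
    rows = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0], [0, 1, 1], [1, 0, 1]]
    y = [2.0, 0.0, -1.0, 2.0, -1.0, 1.0]
    shards = [(rows[:3], y[:3]), (rows[3:], y[3:])]  # two simulated workers
    print(run(shards, num_features=3))
```

The sketch updates scheduled coefficients one after another so that every update sees fresh residuals. A real model-parallel system would dispatch the scheduled set to workers concurrently, which is why the choice of which coefficients to schedule together matters.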