Effective Parallelisation for Machine Learning

Authors: Michael Kamp, Mario Boley, Olana Missura, Thomas Gärtner

NeurIPS 2017

Each entry below gives the reproducibility variable, the assessed result, and the supporting LLM response.

Research Type: Experimental
The empirical evaluation of the Radon machine in Section 4 confirms its potential in practical settings. Given the same amount of data as the underlying learning algorithm, the Radon machine achieves a substantial reduction of computation time in realistic applications. Using 150 processors, the Radon machine is between 80 and around 700 times faster than the underlying learning algorithm on a single processing unit. Compared with parallel learning algorithms from Spark's MLlib, it achieves hypotheses of similar quality while requiring only 15-85% of their runtime.

Researcher Affiliation: Collaboration
Michael Kamp, University of Bonn and Fraunhofer IAIS, kamp@cs.uni-bonn.de
Mario Boley, Max Planck Institute for Informatics and Saarland University, mboley@mpi-inf.mpg.de
Olana Missura, Google Inc., olanam@google.com
Thomas Gärtner, University of Nottingham, thomas.gaertner@nottingham.ac.uk

Pseudocode: Yes
Algorithm 1, "Radon Machine".
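
For orientation, here is a minimal Python sketch of the scheme that Algorithm 1 describes; it is our reading of the method, not the authors' Spark implementation. It assumes base hypotheses are parameter vectors in R^d (e.g. linear models), takes the Radon number as r = d + 2, and aggregates by h rounds of Radon-point computation.

```python
import numpy as np

def radon_point(points):
    """Radon point of r >= d + 2 points in R^d.

    Finds a nonzero lambda with sum(lambda) = 0 and sum(lambda_i * p_i) = 0,
    then returns the point common to the convex hulls of the positively
    and negatively weighted points.
    """
    r, _ = points.shape
    A = np.vstack([points.T, np.ones(r)])   # (d + 1) x r linear system
    _, _, vt = np.linalg.svd(A)
    lam = vt[-1]                            # a null-space vector of A
    pos = lam > 0
    return points[pos].T @ lam[pos] / lam[pos].sum()

def radon_machine(hypotheses, h, r):
    """Aggregate r**h hypotheses into one by h rounds of Radon points."""
    H = np.asarray(hypotheses, dtype=float)
    for _ in range(h):
        H = np.array([radon_point(H[i:i + r]) for i in range(0, len(H), r)])
    return H[0]

# Example: aggregate 20**2 hypotheses in R^18 (r = 18 + 2 = 20, h = 2).
print(radon_machine(np.random.randn(400, 18), h=2, r=20).shape)  # (18,)
```

In the paper, the r**h base hypotheses are produced by training the base learning algorithm on disjoint random subsets of the data; only the aggregation step is sketched here.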

Open Source Code: Yes
The source code implementation in Spark can be found in the Bitbucket repository https://bitbucket.org/Michael_Kamp/radonmachine.

Open Datasets: Yes
Moshe Lichman. UCI machine learning repository, 2013. URL http://archive.ics.uci.edu/ml. ... On the SUSY dataset (with 5 000 000 instances and 18 features), the Radon machine on 150 processors with h = 3 is 721 times faster than its base learning algorithms. ... On the CASP9 dataset, the Radon machine is 15% faster than the fastest Spark algorithm. ... On the dataset YearPredictionMSD, regularised least squares regression achieves an RMSE of 12.57, whereas the Radon machine achieves an RMSE of 13.64. ... We also compare the Radon machine on a multi-class prediction problem using conditional maximum entropy models ... on two large multiclass datasets (drift and spoken-arabic-digit).

Dataset Splits: Yes
All results are obtained using 10-fold cross validation.
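
For reproduction purposes, such an evaluation protocol could look like the following scikit-learn sketch; the data here is a synthetic stand-in, not the UCI datasets the paper actually uses.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in with SUSY-like dimensionality (18 features).
X, y = make_classification(n_samples=10_000, n_features=18, random_state=0)

# 10-fold cross validation, mirroring the paper's evaluation protocol.
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=10)
print(f"10-fold CV accuracy: {scores.mean():.3f} (+/- {scores.std():.3f})")
```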

Hardware Specification: No
The experiments are executed on a Spark cluster (5 worker nodes, 25 processors per node). ... Using 150 processors, the Radon machine is between 80 and around 700 times faster than the underlying learning algorithm on a single processing unit. The paper states the number of nodes and processors in the Spark cluster but does not provide specific details on the CPU model, GPU, or memory used.

Software Dependencies: No
We use base learning algorithms from WEKA [44] and scikit-learn [29]. ... Compared with parallel learning algorithms from Spark's MLlib, it achieves hypotheses of similar quality while requiring only 15-85% of their runtime. The paper names software libraries such as WEKA, scikit-learn, and Spark's MLlib but does not specify their version numbers.

Experiment Setup: Yes
We apply the Radon machine with parameter h = 1 and with the maximal parameter h such that each instance of the base learning algorithm is executed on a subset of size at least 100 (denoted h = max). All other parameters of the learning algorithms are optimised on an independent split of the datasets.
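
The "h = max" rule pins down the number of aggregation rounds from the data size alone. A small helper makes this concrete; it reflects our reading of the constraint and assumes hypotheses in R^d, so that the Radon number is r = d + 2.

```python
import math

def max_height(n_samples, radon_number, min_subset=100):
    """Largest h with n_samples / radon_number**h >= min_subset."""
    if n_samples < min_subset:
        return 0
    return int(math.log(n_samples / min_subset, radon_number))

# E.g. SUSY: 5 000 000 instances and 18 features give r = 18 + 2 = 20,
# so h = max is floor(log_20(50 000)) = 3, matching the h = 3 reported
# for SUSY above.
print(max_height(5_000_000, 20))  # -> 3
```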