Federated Multi-Task Learning
Authors: Virginia Smith, Chao-Kai Chiang, Maziar Sanjabi, Ameet S. Talwalkar
NeurIPS 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The resulting method achieves significant speedups compared to alternatives in the federated setting, as we demonstrate through simulations on real-world federated datasets. Finally, we demonstrate the superior empirical performance of MOCHA with a new benchmarking suite of federated datasets. |
| Researcher Affiliation | Academia | Virginia Smith Stanford smithv@stanford.edu Chao-Kai Chiang USC chaokaic@usc.edu Maziar Sanjabi USC maziarsanjabi@gmail.com Ameet Talwalkar CMU talwalkar@cmu.edu |
| Pseudocode | Yes | Our method is given in Algorithm 1 and described in detail in Sections 3.3 and 3.4. Algorithm 1 is titled 'MOCHA: Federated Multi-Task Learning Framework'. |
| Open Source Code | Yes | Our code is available at: github.com/gingsmith/fmtl. |
| Open Datasets | Yes | Google Glass (GLEAM) (http://www.skleinberg.org/data/GLEAM.tar.gz): This dataset consists of two hours of high resolution sensor data collected from 38 participants wearing Google Glass for the purpose of activity recognition. Following [41], we featurize the raw accelerometer, gyroscope, and magnetometer data into 180 statistical, spectral, and temporal features. We model each participant as a separate task, and predict between eating and other activities (e.g., walking, talking, drinking). Human Activity Recognition (https://archive.ics.uci.edu/ml/datasets/Human+Activity+Recognition+Using+Smartphones): Mobile phone accelerometer and gyroscope data collected from 30 individuals, performing one of six activities: {walking, walking-upstairs, walking-downstairs, sitting, standing, lying-down}. We use the provided 561-length feature vectors of time and frequency domain variables generated for each instance [3]. Vehicle Sensor (http://www.ecs.umass.edu/~mduarte/Software.html): Acoustic, seismic, and infrared sensor data collected from a distributed network of 23 sensors, deployed with the aim of classifying vehicles driving by a segment of road [13]. Each instance is described by 50 acoustic and 50 seismic features. We model each sensor as a separate task and predict between AAV-type and DW-type vehicles. |
| Dataset Splits | Yes | For each dataset from Section 5.1, we randomly split the data into 75% training and 25% testing, and learn multi-task, local, and global support vector machine models, selecting the best regularization parameter, λ ∈ {1e-5, 1e-4, 1e-3, 1e-2, 0.1, 1, 10}, for each model using 5-fold cross-validation. (An illustrative per-task split and cross-validation sketch follows this table.) |
| Hardware Specification | No | The paper discusses hardware capabilities (CPU, memory, network connection, power) as system challenges in federated learning, and mentions mobile phones, wearable devices, and smart homes as examples of where this technology could be applied. However, it does not specify any particular hardware (e.g., CPU or GPU models) used to run the simulations/experiments described in the paper. |
| Software Dependencies | No | The paper states 'Our code is available at: github.com/gingsmith/fmtl.' but does not specify any software dependencies or their version numbers (e.g., programming languages, libraries, frameworks). |
| Experiment Setup | Yes | For each dataset from Section 5.1, we randomly split the data into 75% training and 25% testing, and learn multi-task, local, and global support vector machine models, selecting the best regularization parameter, λ ∈ {1e-5, 1e-4, 1e-3, 1e-2, 0.1, 1, 10}, for each model using 5-fold cross-validation. We tune all compared methods for best performance, as we detail in Appendix E. In particular, we simulate systems heterogeneity by randomly choosing the number of local iterations for MOCHA or the mini-batch size for mini-batch methods, ranging from between 10% and 100% of the minimum number of local data points in high-variability environments to between 90% and 100% in low-variability environments (see Appendix E for full details). (An illustrative sketch of this heterogeneity simulation also follows the table.) |
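
The per-task model-selection protocol quoted above (75%/25% split, support vector machines, a seven-value λ grid, 5-fold cross-validation) can be illustrated as follows. This is a minimal sketch, not the authors' released code: it uses scikit-learn on placeholder data for a single task, and the mapping λ → C = 1/λ for `LinearSVC` is an assumption about the SVM parameterization, which the quoted text does not specify.

```python
# Illustrative sketch (not the authors' released code): per-task 75%/25% split
# and 5-fold cross-validation over the lambda grid reported in the paper.
import numpy as np
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)

# Placeholder data for one task (e.g., one GLEAM participant): 200 instances,
# 180 features, binary labels. Real experiments load the featurized datasets.
X = rng.normal(size=(200, 180))
y = rng.integers(0, 2, size=200)

# 75% training / 25% testing split, as described in the paper.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# Regularization grid from the paper; C = 1/lambda is an assumed mapping.
lambdas = [1e-5, 1e-4, 1e-3, 1e-2, 0.1, 1, 10]
grid = {"C": [1.0 / lam for lam in lambdas]}

# 5-fold cross-validation to pick the best regularization parameter.
search = GridSearchCV(LinearSVC(max_iter=10000), grid, cv=5)
search.fit(X_tr, y_tr)

print("best C:", search.best_params_["C"])
print("test accuracy:", search.best_estimator_.score(X_te, y_te))
```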
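The systems-heterogeneity simulation described in the Experiment Setup row can be sketched as below: each round, every node's local work budget (number of local iterations for MOCHA, or mini-batch size for mini-batch baselines) is drawn uniformly from a fraction of the smallest local dataset size. Function and variable names here are hypothetical and may differ from the released code.

```python
# Illustrative sketch of the systems-heterogeneity simulation: per round, each
# node's local work is drawn uniformly from a fraction of the minimum number
# of local data points (10%-100% for high variability, 90%-100% for low).
import numpy as np

def draw_local_budgets(points_per_task, variability="high", rng=None):
    """Draw one local-iteration (or mini-batch-size) budget per task."""
    rng = rng or np.random.default_rng()
    n_min = min(points_per_task)
    lo = 0.1 if variability == "high" else 0.9
    fractions = rng.uniform(lo, 1.0, size=len(points_per_task))
    return np.maximum(1, (fractions * n_min).astype(int))

# Example: 5 tasks (nodes) with unequal local dataset sizes.
sizes = [120, 340, 95, 210, 150]
print(draw_local_budgets(sizes, "high", np.random.default_rng(0)))
print(draw_local_budgets(sizes, "low", np.random.default_rng(0)))
```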