Matrix Sketching for Secure Collaborative Machine Learning

Authors: Mengjiao Zhang, Shusen Wang

Venue: ICML 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct experiments to demonstrate that, first, DBCL does not harm test accuracy; second, DBCL does not increase the communication cost too much; and third, DBCL can defend against client-side gradient-based attacks.
Researcher Affiliation | Academia | Department of Computer Science, Stevens Institute of Technology, Hoboken, NJ 07030.
Pseudocode | No | The paper describes the algorithm steps in paragraph form within Section 4.2 "Algorithm Description" but does not provide structured pseudocode or algorithm blocks. (An illustrative sketch of the sketched-layer idea appears after the table.)
Open Source Code | Yes | The source code is available at the GitHub repo: https://github.com/MengjiaoZhang/DBCL
Open Datasets | Yes | Three datasets are used in the experiments. MNIST has 60,000 training images and 10,000 test images; each image is 28×28. CIFAR-10 has 50,000 training images and 10,000 test images; each image is 32×32×3. Labeled Faces in the Wild (LFW) has 13,233 faces of 5,749 individuals; each face is a 64×47×3 color image.
Dataset Splits | No | The paper uses standard benchmarks (MNIST, CIFAR-10, LFW) and reports their train/test sizes (e.g., MNIST has 60,000 training and 10,000 test images; LFW has 8,150 training and 3,400 test images), along with training parameters such as batch size and epochs. However, it does not specify a validation split (percentages or counts) beyond the implied train/test splits of these benchmarks, nor does it describe how validation, if any, was performed.
Hardware Specification | Yes | The experiments are conducted on a server with 4 NVIDIA GeForce Titan V GPUs, 2 Xeon Gold 6134 CPUs, and 192 GB of memory.
Software Dependencies | No | The paper states "Our method and the compared methods are implemented using PyTorch" but does not provide specific version numbers for PyTorch or any other software dependencies.
Experiment Setup | Yes | The learning rates are tuned to optimize the convergence rate. The data are partitioned among 100 (virtual) clients uniformly at random, and between two communications FedAvg performs local computation for 1 epoch (for MLP) or 5 epochs (for CNN). Sketching is applied to all the dense and convolutional layers except the output layer, with sketch size s = d_in/2 and participation ratio c = 10%. The batch size of local SGD is set to 10 in one setting and 50 in another. A separate setting trains the model by distributed SGD (2 clients and 1 server) with a learning rate of 0.01 and a batch size of 32. (A minimal illustration of the client partition and sampling follows the table.)
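
Since the paper offers no pseudocode, the following is a minimal, illustrative sketch of the sketched-layer idea referenced in the table: a dense layer whose input is first compressed by a CountSketch-style random projection with sketch size s = d_in/2 (as in the experiment setup), with the sketch redrawable between communication rounds. The names SketchedLinear and countsketch_matrix, and the specific choice of a CountSketch projection, are assumptions made for illustration only; this is not the authors' DBCL implementation.

```python
# Illustrative sketch only (not the authors' DBCL code): a dense layer whose
# input is compressed by a CountSketch-style random projection of size
# s = d_in // 2 before the learned weights are applied.
import torch
import torch.nn as nn


def countsketch_matrix(d_in: int, s: int) -> torch.Tensor:
    """Build a d_in x s CountSketch matrix: each input coordinate is hashed
    to one of s buckets and multiplied by a random sign."""
    buckets = torch.randint(0, s, (d_in,))         # hash h: [d_in] -> [s]
    signs = torch.randint(0, 2, (d_in,)) * 2 - 1   # Rademacher signs in {-1, +1}
    S = torch.zeros(d_in, s)
    S[torch.arange(d_in), buckets] = signs.float()
    return S


class SketchedLinear(nn.Module):
    """Linear layer that operates on the sketched input x @ S instead of x."""

    def __init__(self, d_in: int, d_out: int):
        super().__init__()
        s = d_in // 2                               # sketch size s = d_in / 2
        self.register_buffer("S", countsketch_matrix(d_in, s))
        self.linear = nn.Linear(s, d_out)

    def resample_sketch(self):
        """Draw a fresh sketching matrix (e.g., at the start of a round)."""
        d_in, s = self.S.shape
        self.S.copy_(countsketch_matrix(d_in, s))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.linear(x @ self.S)              # project, then apply weights


layer = SketchedLinear(d_in=784, d_out=128)         # e.g., flattened MNIST input
out = layer(torch.randn(10, 784))                   # batch of 10, as in the setup
```

In this sketch the projection matrix is stored as a non-trainable buffer, so only the smaller linear weight (of size d_out × s) is learned; consult the paper and the linked repository for the actual DBCL protocol.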
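
The experiment setup also states that data are partitioned among 100 (virtual) clients uniformly at random, with a participation ratio of c = 10% and a local SGD batch size of 10. Below is a minimal sketch of that data preparation, assuming MNIST loaded via torchvision and equal-sized random shards; the variable names and loader details are assumptions, not the authors' code.

```python
# Minimal illustration of the federated data setup described in the table
# (assumed details, not the authors' code): 100 clients, uniform random
# partition into equal shards, 10% of clients sampled per round.
import random

import torch
from torch.utils.data import DataLoader, Subset
from torchvision import datasets, transforms

NUM_CLIENTS = 100          # data are partitioned among 100 (virtual) clients
PARTICIPATION = 0.10       # participation ratio c = 10%
LOCAL_BATCH_SIZE = 10      # batch size of local SGD

train_set = datasets.MNIST("data", train=True, download=True,
                           transform=transforms.ToTensor())

# Shuffle all indices once, then split them into 100 disjoint client shards.
indices = torch.randperm(len(train_set)).tolist()
shard_size = len(train_set) // NUM_CLIENTS
client_data = [
    Subset(train_set, indices[i * shard_size:(i + 1) * shard_size])
    for i in range(NUM_CLIENTS)
]

# Each communication round, the server samples c * NUM_CLIENTS clients,
# and each sampled client runs local SGD over its own shard.
round_clients = random.sample(range(NUM_CLIENTS), int(PARTICIPATION * NUM_CLIENTS))
round_loaders = {
    cid: DataLoader(client_data[cid], batch_size=LOCAL_BATCH_SIZE, shuffle=True)
    for cid in round_clients
}
```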