CaPC Learning: Confidential and Private Collaborative Learning

Authors: Christopher A. Choquette-Choo, Natalie Dullerud, Adam Dziedzic, Yunxiang Zhang, Somesh Jha, Nicolas Papernot, Xiao Wang

ICLR 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments on SVHN and CIFAR10 demonstrate that CaPC enables participants to collaborate and improve the utility of their models, even in the heterogeneous setting where the architectures of their local models differ, and when there are only a few participants.
Researcher Affiliation | Collaboration | Christopher A. Choquette-Choo and Natalie Dullerud (University of Toronto and Vector Institute, {christopher.choquette.choo,natalie.dullerud}@mail.utoronto.ca); Adam Dziedzic (Vector Institute, ady@vectorinstitute.ai); Yunxiang Zhang (The Chinese University of Hong Kong, yunxiang.zhang@ie.cuhk.edu.hk); Somesh Jha (University of Wisconsin-Madison and XaiPient, jha@cs.wisc.edu); Nicolas Papernot (University of Toronto and Vector Institute, nicolas.papernot@utoronto.ca); Xiao Wang (Northwestern University, wangxiao@cs.northwestern.edu)
Pseudocode | No | The paper includes a protocol description in Figure 1, but it is a diagrammatic representation of steps, not structured pseudocode or an algorithm block.
Open Source Code | Yes | Code is available at: https://github.com/cleverhans-lab/capc-iclr.
Open Datasets | Yes | Our experiments on SVHN and CIFAR10 demonstrate that CaPC enables participants to collaborate and improve the utility of their models... We use the following for experiments unless otherwise noted. We uniformly sample from the training set in use, without replacement, to create disjoint partitions, Di, of equal size and identical data distribution for each party... We select K = 50 and K = 250 as the number of parties for CIFAR10 and SVHN, respectively (the number is larger for SVHN because we have more data).
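The partitioning described above (uniform sampling without replacement into equal-size, identically distributed disjoint partitions) can be sketched as follows. This is an illustrative reconstruction, not the authors' code; the function name, seed, and NumPy usage are assumptions.

```python
import numpy as np

def make_disjoint_partitions(num_examples, num_parties, seed=0):
    """Shuffle example indices uniformly at random and split them into
    equal-size disjoint partitions, one per party. Any remainder that does
    not divide evenly is dropped (an assumption; the paper only states the
    partitions are disjoint and of equal size)."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(num_examples)  # sampling without replacement
    per_party = num_examples // num_parties
    return [indices[i * per_party:(i + 1) * per_party]
            for i in range(num_parties)]

# CIFAR10: 50,000 training examples split across K = 50 parties,
# giving each party a private partition Di of 1,000 examples.
partitions = make_disjoint_partitions(50_000, 50)
```

Because every index appears in exactly one partition, each party's Di is disjoint from the others while sharing the same underlying data distribution.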
Dataset Splits | No | The paper mentions training and testing, but does not explicitly provide details about a distinct validation split (percentages, counts, or methodology).
Hardware Specification | No | The paper mentions 'CPU' and 'GPU' in Table 1 and states 'HE-transformer only supports inference on CPUs', but it does not specify exact CPU or GPU models or other detailed hardware specifications used for experiments.
Software Dependencies | No | The paper mentions using the 'HE-transformer library with MPC (MP2ML)' and 'the EMP toolkit' along with their respective citations, but it does not provide specific version numbers for these software components.
Experiment Setup | Yes | We use the following for experiments unless otherwise noted. We uniformly sample from the training set in use, without replacement, to create disjoint partitions, Di, of equal size and identical data distribution for each party. We select K = 50 and K = 250 as the number of parties for CIFAR10 and SVHN, respectively (the number is larger for SVHN because we have more data). We select Q = 3 querying parties, Pi, and similarly divide part of the test set into Q separate private pools for each Pi to select queries, until their privacy budget of ϵ is reached (using Gaussian noise with σ = 40 on SVHN and 7 on CIFAR10). We fix ϵ = 2 and 20 for SVHN and CIFAR10, respectively (which leads to 550 queries per party), and report accuracy on the evaluation set. Querying models are retrained on their Di plus the newly labelled data; the difference in accuracies is their accuracy improvement. We use shallower variants of VGG, namely VGG-5 and VGG-7 for CIFAR10 and SVHN, respectively, to accommodate the small size of each party's private dataset.
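The Gaussian-noise labeling step in the setup above (σ = 40 on SVHN, σ = 7 on CIFAR10) follows the PATE-style noisy-argmax pattern: answering parties vote on a label, and Gaussian noise is added to the vote histogram before taking the argmax. The sketch below shows only this plaintext-equivalent aggregation; in CaPC the aggregation actually runs under secure computation, and the function and variable names here are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def noisy_aggregate(votes, sigma, num_classes=10, rng=None):
    """Return the argmax of the per-class vote histogram after adding
    independent Gaussian noise N(0, sigma^2) to each count. This is the
    plaintext analogue of the private labeling step; real CaPC performs
    the aggregation securely across parties."""
    rng = rng or np.random.default_rng()
    histogram = np.bincount(votes, minlength=num_classes).astype(float)
    histogram += rng.normal(0.0, sigma, size=num_classes)
    return int(np.argmax(histogram))

# Example: votes from the 249 answering parties on SVHN
# (K = 250 parties minus the querying party), with sigma = 40.
rng = np.random.default_rng(0)
votes = rng.integers(0, 10, size=249)
label = noisy_aggregate(votes, sigma=40, rng=rng)
```

Larger σ gives stronger privacy per query but noisier labels, which is why the per-party privacy budget ϵ caps the number of queries (550 per party here) before the querying model is retrained on its Di plus the newly labelled data.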