Outsourcing Training without Uploading Data via Efficient Collaborative Open-Source Sampling

Authors: Junyuan Hong, Lingjuan Lyu, Jiayu Zhou, Michael Spranger

NeurIPS 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive empirical studies show that the proposed ECOS improves the quality of automated client labeling, model compression, and label outsourcing when applied in various learning scenarios.
Researcher Affiliation | Collaboration | Junyuan Hong (Michigan State University, hongju12@msu.edu); Lingjuan Lyu (Sony AI, lingjuan.lv@sony.com); Jiayu Zhou (Michigan State University, jiayuz@msu.edu); Michael Spranger (Sony AI, michael.spranger@sony.com)
Pseudocode | Yes | Algorithm 1: Efficient collaborative open-source sampling (ECOS); an illustrative sketch is given after the table.
Open Source Code | No | We include the instructions but not codes for reproducing results.
Open Datasets | Yes | We use datasets from two tasks: digit recognition and object recognition. Distinct from prior work [56], in our work the open-source data contains samples out of the client's distribution. With the same classes as the client dataset, we assume open-source data are from different environments and therefore include different feature distributions, for example, DomainNet [41] and Digits [28].
Dataset Splits | Yes | Splits of client and cloud datasets: for Digits, we use one domain for the client and the remaining domains for the cloud as the open-source set; for DomainNet, we randomly select 50% of the samples from one domain for the client and leave the remaining samples, together with all other domains, to the cloud. Each experiment is repeated three times with seeds {1, 2, 3}. A split sketch is given after the table.
Hardware Specification | No | The paper mentions a 'powerful cloud server' and a 'low-power and cost-effective end device' but does not specify exact GPU or CPU models or detailed cloud resource types used for the experiments in the main text.
Software Dependencies | No | The paper refers to various models and methods such as FixMatch, KMeans, ResNet50, and private kNN, but it does not specify software dependencies such as programming languages, deep learning frameworks (e.g., PyTorch, TensorFlow), or other libraries with their respective version numbers.
Experiment Setup | Yes | Details of hyper-parameters are deferred to Appendix B.1. On the selected samples, we train a linear classifier head for 30 epochs under the supervision of true labels and the teacher model f_t, and then fine-tune the full network f_s for 500 epochs. A training-schedule sketch is given after the table.
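
The Pseudocode row points to Algorithm 1 (ECOS), which this report does not reproduce. The following is only a minimal sketch of the general collaborative-sampling idea, under the assumption that the client shares compressed statistics (cluster centroids) of its private features and the cloud keeps the open-source samples closest to those centroids; the function names, feature space, and selection rule are assumptions, not the paper's exact Algorithm 1.

```python
# Illustrative sketch only, not the paper's Algorithm 1.
# Assumption: the client uploads k-means centroids of its private features,
# and the cloud keeps the open-source samples closest to any centroid.
import numpy as np
from sklearn.cluster import KMeans

def client_summarize(private_features: np.ndarray, n_centroids: int = 32) -> np.ndarray:
    """Client side: compress private features into centroids; no raw data leaves the device."""
    km = KMeans(n_clusters=n_centroids, n_init=10, random_state=0)
    km.fit(private_features)
    return km.cluster_centers_

def cloud_sample(open_features: np.ndarray, centroids: np.ndarray, budget: int) -> np.ndarray:
    """Cloud side: return indices of the `budget` open-source samples nearest to any client centroid."""
    # Squared distances between every open-source feature and every centroid.
    d2 = ((open_features[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
    score = d2.min(axis=1)              # distance to the closest centroid
    return np.argsort(score)[:budget]   # keep the most client-like open-source samples

# Toy usage with random vectors standing in for encoder features.
rng = np.random.default_rng(0)
client_feats = rng.normal(size=(500, 64))
open_feats = rng.normal(size=(5000, 64))
selected = cloud_sample(open_feats, client_summarize(client_feats), budget=1000)
```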
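
The Dataset Splits row describes the client/cloud partition in prose. A minimal sketch of that rule is given below, assuming a hypothetical `load_domain` loader that returns the samples of one domain; the 50% DomainNet split and the seeds {1, 2, 3} follow the quoted description.

```python
# Sketch of the reported client/cloud splits; `load_domain` is a hypothetical loader.
import random

def split_digits(domains, client_domain):
    """Digits: one domain goes to the client, all remaining domains form the cloud's open-source pool."""
    client = load_domain(client_domain)
    cloud = [s for d in domains if d != client_domain for s in load_domain(d)]
    return client, cloud

def split_domainnet(domains, client_domain, seed):
    """DomainNet: the client gets a random 50% of one domain; the rest plus all other domains go to the cloud."""
    rng = random.Random(seed)            # repeated with seeds {1, 2, 3}
    samples = load_domain(client_domain)
    rng.shuffle(samples)
    half = len(samples) // 2
    client = samples[:half]
    cloud = samples[half:] + [s for d in domains if d != client_domain for s in load_domain(d)]
    return client, cloud
```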
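
The Experiment Setup row quotes a two-stage schedule: a linear head trained for 30 epochs under the true labels and the teacher model f_t, followed by fine-tuning of the full network f_s for 500 epochs. The PyTorch-style sketch below only illustrates that freeze-then-unfreeze schedule; the model attributes (`backbone`, `head`), optimizer settings, and the distillation loss weight are assumptions not given in the quoted text.

```python
# Sketch of the reported two-stage schedule; models, loaders, and loss weights are assumed.
import torch
import torch.nn.functional as F

def train_student(student, teacher, loader, head_epochs=30, full_epochs=500, alpha=0.5, lr=1e-3):
    teacher.eval()
    # Stage 1: train only the linear classifier head (30 epochs), backbone frozen.
    for p in student.backbone.parameters():
        p.requires_grad = False
    opt = torch.optim.SGD(student.head.parameters(), lr=lr, momentum=0.9)
    run_epochs(student, teacher, loader, opt, head_epochs, alpha)
    # Stage 2: fine-tune the full network (500 epochs).
    for p in student.parameters():
        p.requires_grad = True
    opt = torch.optim.SGD(student.parameters(), lr=lr, momentum=0.9)
    run_epochs(student, teacher, loader, opt, full_epochs, alpha)

def run_epochs(student, teacher, loader, opt, epochs, alpha):
    for _ in range(epochs):
        for x, y in loader:
            with torch.no_grad():
                t_logits = teacher(x)                      # soft supervision from the teacher f_t
            s_logits = student(x)
            # True-label loss plus a simple soft-label (teacher) loss.
            loss = F.cross_entropy(s_logits, y) + alpha * F.kl_div(
                F.log_softmax(s_logits, dim=1),
                F.softmax(t_logits, dim=1),
                reduction="batchmean",
            )
            opt.zero_grad()
            loss.backward()
            opt.step()
```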