Outsourcing Training without Uploading Data via Efficient Collaborative Open-Source Sampling
Authors: Junyuan Hong, Lingjuan Lyu, Jiayu Zhou, Michael Spranger
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive empirical studies show that the proposed ECOS improves the quality of automated client labeling, model compression, and label outsourcing when applied in various learning scenarios. |
| Researcher Affiliation | Collaboration | Junyuan Hong (Michigan State University, hongju12@msu.edu); Lingjuan Lyu (Sony AI, lingjuan.lv@sony.com); Jiayu Zhou (Michigan State University, jiayuz@msu.edu); Michael Spranger (Sony AI, michael.spranger@sony.com) |
| Pseudocode | Yes | Algorithm 1 Efficient collaborative open-source sampling (ECOS) |
| Open Source Code | No | We include the instructions but not the code for reproducing results. |
| Open Datasets | Yes | We use datasets from two tasks: digit recognition and object recognition. Distinct from prior work [56], in our work the open-source data contains samples out of the client's distribution. With the same classes as the client dataset, we assume open-source data are from different environments and therefore include different feature distributions, for example, DomainNet [41] and Digits [28]. |
| Dataset Splits | Yes | Splits of client and cloud datasets. For Digits, we use one domain for the client and the remaining domains for the cloud as the open-source set. For DomainNet, we randomly select 50% of the samples from one domain for the client and leave the rest, together with all other domains, to the cloud. Each experiment case is repeated three times with seeds {1, 2, 3}. |
| Hardware Specification | No | The paper mentions 'powerful cloud server' and 'low-power and cost-effective end device' but does not specify any exact GPU or CPU models, or detailed cloud resource types used for the experiments in the main text. |
| Software Dependencies | No | The paper refers to various models and methods such as FixMatch, KMeans, ResNet50, and private kNN, but it does not specify software dependencies such as programming languages, deep learning frameworks (e.g., PyTorch, TensorFlow), or other libraries with their respective version numbers. |
| Experiment Setup | Yes | Details of hyper-parameters are deferred to Appendix B.1. On the selected samples, we train a linear classifier head for 30 epochs under the supervision of true labels and the teacher model ft, and then fine-tune the full network fs for 500 epochs. |
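
The client/cloud split procedure quoted in the Dataset Splits row (a seeded 50% draw from one domain for the client, the remainder to the cloud) can be sketched as follows. This is a minimal illustrative sketch: the function name, parameters, and use of Python's `random` module are assumptions, not taken from the paper.

```python
import random

def split_client_cloud(domain_samples, client_fraction=0.5, seed=1):
    """Hypothetical sketch of the DomainNet-style split: a seeded random
    fraction of one domain's samples goes to the client; the rest stays
    on the cloud (to be pooled with all other domains as the open-source set)."""
    rng = random.Random(seed)          # per-seed RNG for reproducible splits
    shuffled = list(domain_samples)
    rng.shuffle(shuffled)
    n_client = int(len(shuffled) * client_fraction)
    return shuffled[:n_client], shuffled[n_client:]

# Each experiment case is repeated with seeds {1, 2, 3}, as in the paper.
splits = {s: split_client_cloud(range(100), seed=s) for s in (1, 2, 3)}
```

Fixing the seed per repetition makes each client/cloud partition reproducible across runs, which is what allows the three-seed averaging reported in the paper.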