Reducing Communication for Split Learning by Randomized Top-k Sparsification
Authors: Fei Zheng, Chaochao Chen, Lingjuan Lyu, Binhui Yao
IJCAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical results show that compared with other communication-reduction methods, our proposed randomized top-k sparsification achieves a better model performance under the same compression level. |
| Researcher Affiliation | Collaboration | Fei Zheng (Zhejiang University), Chaochao Chen (Zhejiang University), Lingjuan Lyu (Sony AI), Binhui Yao (Midea Group) |
| Pseudocode | No | The paper describes procedures and functions but does not present them in a structured pseudocode block or algorithm section. |
| Open Source Code | No | The paper does not contain any explicit statements or links indicating that source code for the described methodology is publicly available. |
| Open Datasets | Yes | We perform the experiments of different compression methods on four datasets, i.e., CIFAR-100 [Krizhevsky et al., 2009], Yoo Choose [Ben-Shimon et al., 2015], DBPedia [Auer et al., 2007], and Tiny-Imagenet [Le and Yang, 2015] |
| Dataset Splits | No | The paper uses standard datasets but does not explicitly detail the proportions or methodology for splitting data into training, validation, and test sets, how validation was performed, or how any cross-validation folds were created. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., programming languages, libraries, frameworks). |
| Experiment Setup | Yes | We set α to 0.1 for all tasks except for Yoo Choose, where α = 0.05. For CIFAR-100 and Tiny-Imagenet, data augmentation including random cropping and flipping is used. The size of GRU layer is set to 300. For Text CNN, we set kernel sizes to [3,4,5] and use the Glove word embeddings [Pennington et al., 2014] as initialization. |
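The paper's central technique, quoted in the Research Type row above, is randomized top-k sparsification of the intermediate activations exchanged between client and server in split learning. The sketch below is a minimal, illustrative variant that mixes deterministic top-k selection with uniform random selection of a few non-top-k entries; the function name, the `p_random` parameter, and the exact sampling and rescaling rules are assumptions and may differ from the authors' scheme.

```python
import numpy as np

def randomized_topk_sparsify(x, k, p_random=0.1, rng=None):
    """Sparsify a 1-D activation vector to k nonzero entries.

    A fraction p_random of the k kept positions is sampled uniformly
    from the non-top-k entries; the rest are the largest-magnitude
    entries. Illustrative only; the paper's exact sampling scheme
    and any rescaling may differ.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=np.float32)
    order = np.argsort(-np.abs(x))              # indices sorted by decreasing magnitude
    n_rand = min(int(round(p_random * k)), x.size - k)
    n_top = k - n_rand
    top_idx = order[:n_top]                     # deterministic top entries
    rand_idx = rng.choice(order[n_top:], size=n_rand, replace=False)
    keep = np.concatenate([top_idx, rand_idx])
    out = np.zeros_like(x)
    out[keep] = x[keep]                         # only kept indices/values need to be transmitted
    return out

# Example: keep 10 of 100 activations, with roughly 10% of the slots chosen at random.
if __name__ == "__main__":
    h = np.random.randn(100).astype(np.float32)
    h_sparse = randomized_topk_sparsify(h, k=10)
    print(np.count_nonzero(h_sparse))           # 10
```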
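For convenience, the hyperparameters quoted in the Experiment Setup row can be collected into a single configuration dictionary. The key names and structure below are hypothetical and not taken from the authors' code; only the values come from the paper, and the exact role of α is defined there.

```python
# Hypothetical consolidation of the reported hyperparameters.
EXPERIMENT_CONFIG = {
    "alpha": 0.1,                        # all tasks except Yoo Choose
    "alpha_yoochoose": 0.05,
    "data_augmentation": {               # applied to CIFAR-100 and Tiny-Imagenet only
        "random_crop": True,
        "random_flip": True,
    },
    "gru_hidden_size": 300,
    "textcnn_kernel_sizes": [3, 4, 5],
    "textcnn_embedding_init": "glove",   # GloVe word embeddings (Pennington et al., 2014)
}
```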