Distributed Nyström Kernel Learning with Communications

Authors: Rong Yin, Weiping Wang, Dan Meng

ICML 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental
In this section, we report numerical results to verify the theoretical statements about the power of communications in DKRR-NY-CM on a simulated dataset. The testing results are shown in Figure 1 and can be summarized as follows: 1) The larger p is, the larger the gaps between the distributed algorithms (DKRR-NY and DKRR-NY-CM) and KRR. When p is larger than an upper bound, the MSE of the distributed algorithms is far from that of exact KRR. This verifies the statements about p in Theorems 1, 2, and 3. 2) The upper bound of p in our DKRR-NY-CM is much larger than that of DKRR-NY. This result verifies Theorem 3: the communication strategy relaxes the restriction on p. 3) The upper bound of p increases as the number of communications increases, which shows the effectiveness of communication and is consistent with our theoretical analysis in Eq. (23) of Theorem 3.
Researcher Affiliation | Academia
(1) Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China; (2) School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China; (3) Gaoling School of Artificial Intelligence, Renmin University of China, Beijing, China; (4) Beijing Key Laboratory of Big Data Management and Analysis Methods, Beijing, China.
Pseudocode | Yes
Algorithm 1 DKRR-NY with Communications (DKRR-NY-CM)
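The paper's Algorithm 1 is not reproduced in this report. Below is a minimal sketch of a DKRR-NY-CM-style estimator, assuming shared Nyström landmarks across workers and a Newton-type communication scheme in the spirit of Lin et al. (2020); the RBF kernel choice, all function names, and the update details are illustrative assumptions, not the authors' implementation.

    import numpy as np

    def rbf_kernel(A, B, gamma=1.0):
        """Gaussian (RBF) kernel matrix between rows of A and rows of B."""
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)

    def nystrom_features(X, landmarks, gamma=1.0, jitter=1e-10):
        """Map X into the m-dimensional Nystrom feature space K_xm K_mm^{-1/2}."""
        K_xm = rbf_kernel(X, landmarks, gamma)
        K_mm = rbf_kernel(landmarks, landmarks, gamma)
        # symmetric inverse square root of K_mm via eigendecomposition
        w, V = np.linalg.eigh(K_mm + jitter * np.eye(len(landmarks)))
        K_mm_inv_sqrt = V @ np.diag(1.0 / np.sqrt(np.maximum(w, jitter))) @ V.T
        return K_xm @ K_mm_inv_sqrt

    def dkrr_ny_cm(X_parts, y_parts, landmarks, lam, n_rounds, gamma=1.0):
        """Hedged sketch of DKRR-NY with communications.

        Each worker holds (X_j, y_j); all share the same m landmarks.
        Round 0: weighted average of local Nystrom-KRR solutions.
        Rounds 1..T: workers send local gradients, the master forms the global
        gradient, workers return local Newton-type directions, the master averages.
        """
        p = len(X_parts)
        N = sum(len(y) for y in y_parts)
        m = len(landmarks)
        Phi = [nystrom_features(Xj, landmarks, gamma) for Xj in X_parts]
        n = [len(yj) for yj in y_parts]

        # local Hessians of the local regularized least-squares objectives
        H = [Phi[j].T @ Phi[j] / n[j] + lam * np.eye(m) for j in range(p)]

        # round 0: divide-and-conquer estimator (weighted average of local solves)
        w = sum((n[j] / N) * np.linalg.solve(H[j], Phi[j].T @ y_parts[j] / n[j])
                for j in range(p))

        for _ in range(n_rounds):
            # workers -> master: assemble the gradient of the global objective
            grad = sum(Phi[j].T @ (Phi[j] @ w - y_parts[j]) for j in range(p)) / N + lam * w
            # master -> workers: each applies its local inverse Hessian to the
            # global gradient; the master averages the Newton-type directions
            step = sum((n[j] / N) * np.linalg.solve(H[j], grad) for j in range(p))
            w = w - step
        return w  # predictor: f(x) = nystrom_features(x, landmarks) @ w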
Open Source Code | No
The paper does not provide any statement about releasing source code or a link to a code repository for the described methodology.
Open Datasets | No
The way of generating the synthetic data is as below. The training samples {x_i} (i = 1, ..., N) and the testing samples {x'_i} (i = 1, ..., N') are independently drawn according to the uniform distribution on the (hyper-)cube [0, 1]... The way of generating the dataset is the same as (Lin et al., 2020).
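Only the uniform inputs on the [0, 1] hypercube, the reference to Lin et al. (2020), and the 20000/2000 split are stated in the text. The sketch below fills in a data generator whose target function and noise level are assumptions for illustration only.

    import numpy as np

    def make_synthetic(n_train=20000, n_test=2000, d=1, noise_std=0.2, seed=0):
        """Hedged sketch of the simulated data described in the paper.

        Only the uniform inputs on [0, 1]^d and the 20000/2000 split come from
        the text; f_star and noise_std are illustrative assumptions, not taken
        from the paper or Lin et al. (2020).
        """
        rng = np.random.default_rng(seed)

        def f_star(X):
            # assumed hat-shaped target, common in KRR simulations
            return np.minimum(X[:, 0], 1.0 - X[:, 0])

        X_train = rng.uniform(0.0, 1.0, size=(n_train, d))
        X_test = rng.uniform(0.0, 1.0, size=(n_test, d))
        y_train = f_star(X_train) + noise_std * rng.standard_normal(n_train)
        y_test = f_star(X_test)  # evaluate MSE against the noise-free target
        return X_train, y_train, X_test, y_test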
Dataset Splits | No
Generating 20000 training samples and 2000 testing samples.
Hardware Specification | No
The paper does not provide specific details about the hardware used for running the experiments.
Software Dependencies | No
The paper does not provide specific software dependencies with version numbers.
Experiment Setup | Yes
According to the proposed theorem, we set the sampling scale m = √N and λ = 1/(2√N). In the training process of the distributed algorithms, we uniformly distribute the N training samples to p local processors. The criterion is the mean square error (MSE). ... repeat the training 5 times, and estimate the averaged error on the testing samples.
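A minimal sketch of the evaluation loop under the quoted setup, reusing the helpers sketched above. It assumes the garbled settings read m = √N and λ = 1/(2√N); the kernel width gamma, the grid of p values, and the number of communication rounds are simplifying assumptions, and the exact-KRR baseline of Figure 1 is omitted.

    import numpy as np

    def run_experiment(p_values=(4, 8, 16, 32, 64), n_rounds=3, gamma=5.0, n_repeats=5):
        N = 20000
        m = int(np.sqrt(N))             # sampling scale m = sqrt(N)
        lam = 1.0 / (2.0 * np.sqrt(N))  # lambda = 1 / (2 sqrt(N))
        results = {p: [] for p in p_values}
        for rep in range(n_repeats):    # repeat training 5 times, average the MSE
            rng = np.random.default_rng(rep)
            X, y, X_te, y_te = make_synthetic(n_train=N, n_test=2000, seed=rep)
            landmarks = X[rng.choice(N, size=m, replace=False)]
            Phi_te = nystrom_features(X_te, landmarks, gamma)
            for p in p_values:
                # uniformly distribute the N training samples over p local processors
                idx = np.array_split(rng.permutation(N), p)
                X_parts = [X[i] for i in idx]
                y_parts = [y[i] for i in idx]
                w = dkrr_ny_cm(X_parts, y_parts, landmarks, lam, n_rounds, gamma)
                results[p].append(np.mean((Phi_te @ w - y_te) ** 2))
        return {p: float(np.mean(v)) for p, v in results.items()}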