Distributed Randomized Sketching Kernel Learning

Authors: Rong Yin, Yong Liu, Dan Meng (pp. 8883–8891)

AAAI 2022

Reproducibility Variable Result LLM Response
Research Type Experimental An extensive experiment validates the effectiveness of DKRR-RS and the communication strategy on real datasets. In this section, we present an extensive experiment on commonly used datasets to verify our theoretical predictions. The empirical evaluations of DKRR-RS and DKRR-RS-CM use the Gaussian kernel, $e^{-\frac{1}{2h^2}\|x_1 - x_2\|^2}$, on the cadata (20640 samples), shuttle (43500 samples), w8a (49749 samples), and connect-4 (67557 samples) datasets, where the optimal $h \in 2^{[-2:0.5:5]}$ and $\lambda \in 2^{[-16:3:-4]}$ are selected via 5-fold cross-validation.
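For concreteness, here is a minimal NumPy sketch of the quoted kernel. Only the formula $e^{-\frac{1}{2h^2}\|x_1 - x_2\|^2}$ comes from the paper; the function name and the vectorized implementation are our own.

```python
import numpy as np

def gaussian_kernel(X1, X2, h):
    """Gaussian kernel K(x1, x2) = exp(-||x1 - x2||^2 / (2 h^2))."""
    # Pairwise squared Euclidean distances between rows of X1 and X2.
    sq_dists = (
        np.sum(X1**2, axis=1)[:, None]
        + np.sum(X2**2, axis=1)[None, :]
        - 2.0 * X1 @ X2.T
    )
    return np.exp(-sq_dists / (2.0 * h**2))
```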
Researcher Affiliation Academia Rong Yin (1,2), Yong Liu (3,4), Dan Meng (1,2). Affiliations: 1) Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China; 2) School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China; 3) Gaoling School of Artificial Intelligence, Renmin University of China, Beijing, China; 4) Beijing Key Laboratory of Big Data Management and Analysis Methods, Beijing, China
Pseudocode Yes Algorithm 1: DKRR-RS with Communications (DKRR-RS-CM)
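The paper's Algorithm 1 is not reproduced on this page, so the following is only an illustrative sketch of the generic divide-and-conquer pattern that DKRR-style methods follow: each worker solves a sketched local kernel ridge regression, and predictions are averaged (the basic communication step). The Nyström-style row sketch, the function names, and the local solve formula below are our assumptions, not the authors' Algorithm 1. It reuses the `gaussian_kernel` helper sketched above.

```python
import numpy as np

def dkrr_rs_sketch(X_parts, y_parts, h, lam, sketch_dim, seed=0):
    """Illustrative divide-and-conquer KRR with a simple row sketch.

    NOT the paper's Algorithm 1; a hedged sketch of the general pattern:
    each partition fits a sketched (Nystrom-style) KRR model locally.
    """
    rng = np.random.default_rng(seed)
    models = []
    for X, y in zip(X_parts, y_parts):
        n = X.shape[0]
        # Sub-sample `sketch_dim` landmark rows as a simple random sketch.
        idx = rng.choice(n, size=sketch_dim, replace=False)
        Xs = X[idx]
        K_nm = gaussian_kernel(X, Xs, h)    # n x m cross-kernel
        K_mm = gaussian_kernel(Xs, Xs, h)   # m x m landmark kernel
        # Sketched ridge solve: (K_nm^T K_nm + lam*n*K_mm) alpha = K_nm^T y
        alpha = np.linalg.solve(K_nm.T @ K_nm + lam * n * K_mm, K_nm.T @ y)
        models.append((Xs, alpha))
    return models

def dkrr_predict(models, X_test, h):
    # Average the local predictors (the simplest communication strategy).
    preds = [gaussian_kernel(X_test, Xs, h) @ a for Xs, a in models]
    return np.mean(preds, axis=0)
```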
Open Source Code No The paper mentions code sources for comparison algorithms (e.g., 'The code is from the author of (Yang, Pilanci, and Wainwright 2017)'), but does not explicitly state that the code for their proposed methodology (DKRR-RS or DKRR-RS-CM) is publicly available or provide a link to it.
Open Datasets Yes The empirical evaluations of DKRR-RS and DKRR-RS-CM use the Gaussian kernel, $e^{-\frac{1}{2h^2}\|x_1 - x_2\|^2}$, on the cadata (20640 samples), shuttle (43500 samples), w8a (49749 samples), and connect-4 (67557 samples) datasets, where the optimal $h \in 2^{[-2:0.5:5]}$ and $\lambda \in 2^{[-16:3:-4]}$ are selected via 5-fold cross-validation. The datasets are normalized, with 70% of the samples used for training and the rest for testing. The datasets are from https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/
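A minimal loading sketch under the quoted protocol, assuming scikit-learn and a local copy of the LIBSVM `cadata` file. The file path and the min-max normalization are our assumptions; the paper only says the datasets are normalized and split 70/30.

```python
from sklearn.datasets import load_svmlight_file
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split

# Assumes `cadata` was downloaded from the LIBSVM page cited above
# (hypothetical local path).
X, y = load_svmlight_file("cadata")
X = X.toarray()

# "The datasets are normalized with 70% samples used for training
# and the rest for testing." (Normalization scheme is our assumption.)
X = MinMaxScaler().fit_transform(X)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.7, random_state=0
)
```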
Dataset Splits Yes The datasets are normalized, with 70% of the samples used for training and the rest for testing. The optimal $h \in 2^{[-2:0.5:5]}$ and $\lambda \in 2^{[-16:3:-4]}$ are selected via 5-fold cross-validation.
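A hedged sketch of the quoted model selection, reading the grids as base-2 exponent ranges in MATLAB-style start:step:stop notation (our interpretation of the paper's notation); `fit` and `score` are hypothetical placeholders for a training routine and a validation-error metric.

```python
import numpy as np
from sklearn.model_selection import KFold

# Grids as reported: h in 2^[-2:0.5:5], lambda in 2^[-16:3:-4].
h_grid = 2.0 ** np.arange(-2.0, 5.0 + 0.5, 0.5)
lam_grid = 2.0 ** np.arange(-16.0, -4.0 + 3.0, 3.0)

def cv_select(X, y, fit, score, n_splits=5):
    """Pick (h, lam) minimizing mean validation error over 5 folds."""
    kf = KFold(n_splits=n_splits, shuffle=True, random_state=0)
    best, best_err = None, np.inf
    for h in h_grid:
        for lam in lam_grid:
            errs = []
            for tr, va in kf.split(X):
                model = fit(X[tr], y[tr], h, lam)   # placeholder trainer
                errs.append(score(model, X[va], y[va]))  # placeholder metric
            if np.mean(errs) < best_err:
                best, best_err = (h, lam), np.mean(errs)
    return best
```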
Hardware Specification Yes The experiments are repeated 5 times on a server with 32 cores (2.40 GHz) and 32 GB of RAM.
Software Dependencies No The paper mentions the use of 'Gaussian kernel' but does not specify any software names with version numbers (e.g., Python, PyTorch, TensorFlow, scikit-learn, etc.) that would be necessary to replicate the experiments.
Experiment Setup Yes The empirical evaluations of DKRR-RS and DKRR-RS-CM use the Gaussian kernel, $e^{-\frac{1}{2h^2}\|x_1 - x_2\|^2}$, on the cadata (20640 samples), shuttle (43500 samples), w8a (49749 samples), and connect-4 (67557 samples) datasets, where the optimal $h \in 2^{[-2:0.5:5]}$ and $\lambda \in 2^{[-16:3:-4]}$ are selected via 5-fold cross-validation. The datasets are normalized, with 70% of the samples used for training and the rest for testing.