Distributed Randomized Sketching Kernel Learning

Authors: Rong Yin, Yong Liu, Dan Meng (pp. 8883–8891)

AAAI 2022

Reproducibility Variable Result LLM Response
Research Type Experimental An extensive experiment validates the effectiveness of DKRR-RS and the communication strategy on real datasets. In this section, we present an extensive experiment on commonly used datasets to verify our theoretical predictions. The empirical evaluations of DKRR-RS and DKRR-RS-CM use the Gaussian kernel, $e^{-\frac{1}{2h^2}\|x_1 - x_2\|^2}$, on the cadata (20640 samples), shuttle (43500 samples), w8a (49749 samples), and connect-4 (67557 samples) datasets, where the optimal $h \in 2^{[-2:0.5:5]}$ and $\lambda \in 2^{[-16:3:-4]}$ are selected via 5-fold cross-validation.
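For concreteness, here is a minimal NumPy sketch of the quoted kernel. Only the formula $e^{-\frac{1}{2h^2}\|x_1 - x_2\|^2}$ comes from the paper; the function name and the vectorized implementation are our own.

```python
import numpy as np

def gaussian_kernel(X1, X2, h):
    """Gaussian kernel K(x1, x2) = exp(-||x1 - x2||^2 / (2 h^2))."""
    # Pairwise squared Euclidean distances between rows of X1 and X2.
    sq_dists = (
        np.sum(X1**2, axis=1)[:, None]
        + np.sum(X2**2, axis=1)[None, :]
        - 2.0 * X1 @ X2.T
    )
    return np.exp(-sq_dists / (2.0 * h**2))
```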
Researcher Affiliation Academia Rong Yin (1,2), Yong Liu (3,4), Dan Meng (1,2). Affiliations: 1) Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China; 2) School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China; 3) Gaoling School of Artificial Intelligence, Renmin University of China, Beijing, China; 4) Beijing Key Laboratory of Big Data Management and Analysis Methods, Beijing, China
Pseudocode Yes Algorithm 1: DKRR-RS with Communications (DKRR-RS-CM)
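The paper's Algorithm 1 is not reproduced on this page, so the following is only an illustrative sketch of the generic divide-and-conquer pattern that DKRR-style methods follow: each worker solves a sketched local kernel ridge regression, and predictions are averaged (the basic communication step). The Nyström-style row sketch, the function names, and the local solve formula below are our assumptions, not the authors' Algorithm 1. It reuses the `gaussian_kernel` helper sketched above.

```python
import numpy as np

def dkrr_rs_sketch(X_parts, y_parts, h, lam, sketch_dim, seed=0):
    """Illustrative divide-and-conquer KRR with a simple row sketch.

    NOT the paper's Algorithm 1; a hedged sketch of the general pattern:
    each partition fits a sketched (Nystrom-style) KRR model locally.
    """
    rng = np.random.default_rng(seed)
    models = []
    for X, y in zip(X_parts, y_parts):
        n = X.shape[0]
        # Sub-sample `sketch_dim` landmark rows as a simple random sketch.
        idx = rng.choice(n, size=sketch_dim, replace=False)
        Xs = X[idx]
        K_nm = gaussian_kernel(X, Xs, h)    # n x m cross-kernel
        K_mm = gaussian_kernel(Xs, Xs, h)   # m x m landmark kernel
        # Sketched ridge solve: (K_nm^T K_nm + lam*n*K_mm) alpha = K_nm^T y
        alpha = np.linalg.solve(K_nm.T @ K_nm + lam * n * K_mm, K_nm.T @ y)
        models.append((Xs, alpha))
    return models

def dkrr_predict(models, X_test, h):
    # Average the local predictors (the simplest communication strategy).
    preds = [gaussian_kernel(X_test, Xs, h) @ a for Xs, a in models]
    return np.mean(preds, axis=0)
```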
Open Source Code No The paper mentions code sources for comparison algorithms (e.g., 'The code is from the author of (Yang, Pilanci, and Wainwright 2017)'), but does not explicitly state that the code for their proposed methodology (DKRR-RS or DKRR-RS-CM) is publicly available or provide a link to it.
Open Datasets Yes The empirical evaluations of DKRR-RS and DKRR-RS-CM use the Gaussian kernel, $e^{-\frac{1}{2h^2}\|x_1 - x_2\|^2}$, on the cadata (20640 samples), shuttle (43500 samples), w8a (49749 samples), and connect-4 (67557 samples) datasets, where the optimal $h \in 2^{[-2:0.5:5]}$ and $\lambda \in 2^{[-16:3:-4]}$ are selected via 5-fold cross-validation. The datasets are normalized, with 70% of the samples used for training and the rest for testing. The datasets are from https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/
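A minimal loading sketch under the quoted protocol, assuming scikit-learn and a local copy of the LIBSVM `cadata` file. The file path and the min-max normalization are our assumptions; the paper only says the datasets are normalized and split 70/30.

```python
from sklearn.datasets import load_svmlight_file
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split

# Assumes `cadata` was downloaded from the LIBSVM page cited above
# (hypothetical local path).
X, y = load_svmlight_file("cadata")
X = X.toarray()

# "The datasets are normalized with 70% samples used for training
# and the rest for testing." (Normalization scheme is our assumption.)
X = MinMaxScaler().fit_transform(X)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.7, random_state=0
)
```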
Dataset Splits Yes The datasets are normalized, with 70% of the samples used for training and the rest for testing. The optimal $h \in 2^{[-2:0.5:5]}$ and $\lambda \in 2^{[-16:3:-4]}$ are selected via 5-fold cross-validation.
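A hedged sketch of the quoted model selection, reading the grids as base-2 exponent ranges in MATLAB-style start:step:stop notation (our interpretation of the paper's notation); `fit` and `score` are hypothetical placeholders for a training routine and a validation-error metric.

```python
import numpy as np
from sklearn.model_selection import KFold

# Grids as reported: h in 2^[-2:0.5:5], lambda in 2^[-16:3:-4].
h_grid = 2.0 ** np.arange(-2.0, 5.0 + 0.5, 0.5)
lam_grid = 2.0 ** np.arange(-16.0, -4.0 + 3.0, 3.0)

def cv_select(X, y, fit, score, n_splits=5):
    """Pick (h, lam) minimizing mean validation error over 5 folds."""
    kf = KFold(n_splits=n_splits, shuffle=True, random_state=0)
    best, best_err = None, np.inf
    for h in h_grid:
        for lam in lam_grid:
            errs = []
            for tr, va in kf.split(X):
                model = fit(X[tr], y[tr], h, lam)   # placeholder trainer
                errs.append(score(model, X[va], y[va]))  # placeholder metric
            if np.mean(errs) < best_err:
                best, best_err = (h, lam), np.mean(errs)
    return best
```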
Hardware Specification Yes The experiments are repeated 5 times on a server with 32 cores (2.40 GHz) and 32 GB of RAM.
Software Dependencies No The paper mentions the use of 'Gaussian kernel' but does not specify any software names with version numbers (e.g., Python, PyTorch, TensorFlow, scikit-learn, etc.) that would be necessary to replicate the experiments.
Experiment Setup Yes The empirical evaluations of DKRR-RS and DKRR-RS-CM use the Gaussian kernel, $e^{-\frac{1}{2h^2}\|x_1 - x_2\|^2}$, on the cadata (20640 samples), shuttle (43500 samples), w8a (49749 samples), and connect-4 (67557 samples) datasets, where the optimal $h \in 2^{[-2:0.5:5]}$ and $\lambda \in 2^{[-16:3:-4]}$ are selected via 5-fold cross-validation. The datasets are normalized, with 70% of the samples used for training and the rest for testing.