Fast Prediction for Large-Scale Kernel Machines

Authors: Cho-Jui Hsieh, Si Si, Inderjit S. Dhillon

NeurIPS 2014

Reproducibility assessment: each variable below is listed with its result and the supporting LLM response.
Research Type: Experimental
LLM response: "We apply our algorithm to real world large-scale classification and regression datasets, and show that the proposed algorithm is consistently and significantly better than other competitors. For example, on the Covertype classification problem, in terms of prediction time, our algorithm achieves more than 10000 times speedup over the full kernel SVM, and a two-fold speedup over the state-of-the-art LDKL approach, while obtaining much higher prediction accuracy than LDKL (95.2% vs. 89.53%)."
Researcher Affiliation: Academia
LLM response: "Cho-Jui Hsieh, Si Si, and Inderjit S. Dhillon, Department of Computer Science, University of Texas at Austin, Austin, TX 78712 USA, {cjhsieh,ssi,inderjit}@cs.utexas.edu"
Pseudocode: Yes
LLM response: "Algorithm 1: Kernel Approximation with Pseudo Landmark Points. Our overall algorithm DC-Pred++ is presented in Algorithm 2. Algorithm 2: DC-Pred++: our proposed divide-and-conquer approach for fast prediction."
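As a concrete illustration of what a landmark-based kernel approximation computes, here is a minimal Nyström-style sketch in Python. It assumes an RBF kernel and does not reproduce the paper's pseudo-landmark construction or the divide-and-conquer steps of DC-Pred++; all function names are ours.

```python
import numpy as np

def rbf_kernel(X, Y, gamma):
    # Pairwise RBF kernel: k(x, y) = exp(-gamma * ||x - y||^2).
    sq_dists = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq_dists)

def nystrom_features(X, landmarks, gamma, reg=1e-8):
    # Map each row of X to an m-dimensional feature vector phi(x)
    # such that phi(x) . phi(y) approximates k(x, y).
    W = rbf_kernel(landmarks, landmarks, gamma)   # m x m
    C = rbf_kernel(X, landmarks, gamma)           # n x m
    # Symmetric inverse square root of W (regularized for stability).
    vals, vecs = np.linalg.eigh(W + reg * np.eye(len(landmarks)))
    return C @ (vecs @ np.diag(vals ** -0.5) @ vecs.T)  # n x m
```

With m landmarks, predicting on a new point costs m kernel evaluations plus an m-dimensional inner product, independent of the number of support vectors; that is the lever behind the reported speedups.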
Open Source Code: No
LLM response: The paper does not provide any specific links to source code repositories or explicitly state that the source code for its methodology is available.
Open Datasets: Yes
LLM response: "We use six public datasets (shown in Table 1) for the comparison of kernel SVM prediction time. We further demonstrate the benefits of DC-Pred++ for fast prediction in kernel ridge regression problem on five public datasets listed in Table 2."
Dataset Splits: Yes
LLM response: "The parameters γ, C are selected by cross validation, and the detailed description of parameters for other competitors are shown in Appendix 7.1. The parameters used are chosen by five fold cross-validation (see Appendix 7.1)."
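The protocol behind this verdict is plain five-fold cross-validation over the kernel and regularization parameters. A minimal sketch using scikit-learn follows; the grid values and variable names are illustrative assumptions, since the paper defers the exact settings to Appendix 7.1.

```python
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Hypothetical candidate grid; the actual values are not listed in the paper body.
param_grid = {"C": [0.1, 1, 10, 100], "gamma": [0.01, 0.1, 1]}

search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5)  # five-fold CV
search.fit(X_train, y_train)  # X_train, y_train: assumed preloaded arrays
print(search.best_params_)
```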
Hardware Specification: Yes
LLM response: "All the experiments are conducted on a machine with an Intel 2.83GHz CPU with 32G RAM."
Software Dependencies: No
LLM response: The paper mentions solving linear SVM problems using LIBLINEAR [6], but it does not specify a version number for this software or any other key software components used in the experiments.
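For context on the unpinned dependency: scikit-learn's LinearSVC is backed by LIBLINEAR, so the linear-SVM step could be run along these lines (a sketch under assumed variable names, not the authors' code):

```python
import sklearn
from sklearn.svm import LinearSVC  # wraps the LIBLINEAR solver

print(sklearn.__version__)  # recording the solver version is what the paper omits

clf = LinearSVC(C=1.0)
clf.fit(Z_train, y_train)  # Z_train: approximate kernel features (assumed)
```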
Experiment Setup: Yes
LLM response: "The parameters γ, C are selected by cross validation, and the detailed description of parameters for other competitors are shown in Appendix 7.1. The parameters used are chosen by five fold cross-validation (see Appendix 7.1). To control the prediction cost, for Nys, KNys, and DC-Pred++, we vary the number of landmark points, and for RKS and fastfood, we vary the number of random features."
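The "number of random features" knob used for RKS and fastfood can be made concrete with random Fourier features (Rahimi and Recht); this generic sketch is ours, not the paper's implementation:

```python
import numpy as np

def random_fourier_features(X, gamma, n_features, seed=0):
    # Random kitchen sinks for the RBF kernel: in expectation,
    # z(x) . z(y) ~= exp(-gamma * ||x - y||^2).
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(X.shape[1], n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)
```

Prediction cost then grows linearly with n_features, which is exactly the quantity the setup varies to trade accuracy against prediction time.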