A Divide-and-Conquer Solver for Kernel Support Vector Machines

Authors: Cho-Jui Hsieh, Si Si, Inderjit Dhillon

ICML 2014

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Experimental comparison with other state-of-the-art SVM solvers is shown in Section 5. We now compare our proposed algorithm with other SVM solvers. All the experiments are conducted on an Intel 2.66GHz CPU with 8G RAM. We use 7 benchmark datasets as shown in Table 3. |
| Researcher Affiliation | Academia | Department of Computer Science, The University of Texas, Austin, TX 78721, USA |
| Pseudocode | Yes | Algorithm 1 Divide and Conquer SVM |
| Open Source Code | Yes | The code for DC-SVM is available at http://www.cs.utexas.edu/~cjhsieh/dcsvm. |
| Open Datasets | Yes | The cifar dataset can be downloaded from http://www.cs.toronto.edu/~kriz/cifar.html, and other datasets can be downloaded from http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets or the UCI data repository. |
| Dataset Splits | Yes | We chose the balancing parameter C and kernel parameter γ by 5-fold cross validation on a grid of points... We use a random 80%-20% split for covtype, webspam, kddcup99, a random 8M/0.1M split for mnist8m (used in the original paper (Loosli et al., 2007)), and the original training/testing split for ijcnn1 and cifar. |
| Hardware Specification | Yes | All the experiments are conducted on an Intel 2.66GHz CPU with 8G RAM. |
| Software Dependencies | No | The paper mentions software like LIBSVM and LIBLINEAR, but does not specify their version numbers or other software dependencies with versions. |
| Experiment Setup | Yes | We chose the balancing parameter C and kernel parameter γ by 5-fold cross validation on a grid of points: $C = [2^{-10}, 2^{-9}, \ldots, 2^{10}]$ and $\gamma = [2^{-10}, \ldots, 2^{10}]$ for ijcnn1, census, covtype, webspam, and kddcup99... Regarding the parameters for DC-SVM, we use 5 levels ($l_{\max} = 4$) and $k = 4$, so the five levels have 1, 4, 16, 64 and 256 clusters respectively. For DC-SVM (early), we stop at the level with 64 clusters. The following are parameter settings for other methods in Table 2: the rank is set to be 3000 in LLSVM; number of Fourier features is 3000 in Fastfood; number of clusters is 3000 in LTPU; number of basis vectors is 200 in SpSVM; the tolerance in the stopping condition for LIBSVM and DC-SVM is set to $10^{-3}$ (the default setting of LIBSVM); for LaSVM we set the number of passes to be 1; for Cascade SVM we output the results after the first round. |
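
The experiment setup quoted above can be written down as a small configuration sketch. The snippet below is only an illustration of the stated numbers, not the authors' code: the helper names (`power_of_two_grid`, `clusters_per_level`) and the constant names are hypothetical, and it assumes that level l of DC-SVM holds k^l clusters, which matches the quoted sequence 1, 4, 16, 64, 256.

```python
# Illustrative sketch of the quoted experiment setup.
# All function and constant names here are hypothetical, not from the DC-SVM code.


def power_of_two_grid(lo_exp=-10, hi_exp=10):
    """Return [2**lo_exp, ..., 2**hi_exp], one value per integer exponent."""
    return [2.0 ** e for e in range(lo_exp, hi_exp + 1)]


def clusters_per_level(k=4, l_max=4):
    """Return the cluster count at levels 0..l_max, assuming k**level clusters."""
    return [k ** level for level in range(l_max + 1)]


# Grid searched by 5-fold cross validation (quoted for ijcnn1, census,
# covtype, webspam, and kddcup99).
C_GRID = power_of_two_grid()
GAMMA_GRID = power_of_two_grid()
N_FOLDS = 5

# DC-SVM hierarchy: 5 levels (l_max = 4) with k = 4.
LEVELS = clusters_per_level(k=4, l_max=4)

# DC-SVM (early) stops at the level with 64 clusters; under the k**level
# assumption this is the index of 64 in LEVELS.
EARLY_STOP_LEVEL = LEVELS.index(64)

# Stopping tolerance quoted for LIBSVM and DC-SVM.
STOP_TOLERANCE = 1e-3

if __name__ == "__main__":
    print("grid points per parameter:", len(C_GRID))
    print("(C, gamma) pairs to evaluate:", len(C_GRID) * len(GAMMA_GRID))
    print("clusters per level:", LEVELS)
    print("early-stop level index:", EARLY_STOP_LEVEL)
```

Each parameter takes 21 values, so a full search over this grid evaluates 441 (C, γ) pairs, each with 5-fold cross validation.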