Scalable Kernel Methods via Doubly Stochastic Gradients

Authors: Bo Dai, Bo Xie, Niao He, Yingyu Liang, Anant Raj, Maria-Florina F Balcan, Le Song

NeurIPS 2014 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate that a function learned by this procedure after t iterations converges to the optimal function in the reproducing kernel Hilbert space in rate O(1/t), and achieves a generalization bound of O(1/√t). Our approach can readily scale kernel methods up to the regimes which are dominated by neural nets. We show competitive performances of our approach as compared to neural nets in datasets such as 2.3 million energy materials from MolecularSpace, 8 million handwritten digits from MNIST, and 1 million photos from ImageNet using convolution features.
Researcher Affiliation | Academia | ¹Georgia Institute of Technology, {bodai, bxie33, nhe6, araj34}@gatech.edu, lsong@cc.gatech.edu; ²Princeton University, yingyul@cs.princeton.edu; ³Carnegie Mellon University, ninamf@cs.cmu.edu
Pseudocode | Yes | Algorithm 1: {α_i}_{i=1}^t = Train(P(x, y))
Require: P(ω), φ_ω(x), l(f(x), y), ν.
1: for i = 1, ..., t do
2:   Sample (x_i, y_i) ~ P(x, y).
3:   Sample ω_i ~ P(ω) with seed i.
4:   f(x_i) = Predict(x_i, {α_j}_{j=1}^{i-1}).
5:   α_i = -γ_i l'(f(x_i), y_i) φ_{ω_i}(x_i).
6:   α_j = (1 - γ_i ν) α_j, for j = 1, ..., i - 1.
7: end for
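For concreteness, the following is a minimal NumPy sketch of how Algorithm 1 could be instantiated. It assumes an RBF kernel approximated with random Fourier features, squared loss (so l'(f(x), y) = f(x) - y), and step size γ_i = θ/i; the names (phi, predict, train, theta, n_basis) and all hyperparameters are illustrative, not taken from the paper. The point it illustrates is that ω_i is regenerated from seed i at prediction time, so only the coefficients α_i need to be stored.

```python
# Minimal sketch of doubly stochastic functional gradient descent (Algorithm 1),
# assuming an RBF kernel with random Fourier features, squared loss, and step
# size gamma_i = theta / i. All names and hyperparameters are illustrative.
import numpy as np

def phi(x, seed, d, sigma=1.0, n_basis=1):
    # Random Fourier feature phi_omega(x) for the RBF kernel; omega is regenerated
    # from `seed`, so it never has to be stored alongside the coefficient alpha_i.
    rng = np.random.RandomState(seed)
    omega = rng.randn(n_basis, d) / sigma
    b = rng.uniform(0, 2 * np.pi, n_basis)
    return np.sqrt(2.0 / n_basis) * np.cos(omega @ x + b)

def predict(x, alphas, d):
    # f(x) = sum_i alpha_i . phi_{omega_i}(x), with omega_i re-sampled from seed i.
    return sum(a @ phi(x, seed=i, d=d) for i, a in enumerate(alphas))

def train(X, y, t, theta=1.0, nu=1e-4):
    n, d = X.shape
    alphas = []
    for i in range(t):
        j = np.random.randint(n)                # sample (x_i, y_i) ~ P(x, y)
        gamma = theta / (i + 1)                 # decaying step size gamma_i
        f_xi = predict(X[j], alphas, d)
        grad = f_xi - y[j]                      # l'(f(x), y) for squared loss
        alphas = [(1 - gamma * nu) * a for a in alphas]          # shrink old coefficients
        alphas.append(-gamma * grad * phi(X[j], seed=i, d=d))    # new coefficient alpha_i
    return alphas
```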
Open Source Code | No | No explicit statement indicating the release of source code (e.g., 'We release our code...', 'The source code for our method is available at...') or a link to a code repository was found in the paper.
Open Datasets | Yes |
Name | Model | # of samples | Input dim | Output range | Virtual
(1) Adult | K-SVM | 32K | 123 | {-1, 1} | no
(2) MNIST 8M 8 vs. 6 [25] | K-SVM | 1.6M | 784 | {-1, 1} | yes
(3) Forest | K-SVM | 0.5M | 54 | {-1, 1} | no
(4) MNIST 8M [25] | K-logistic | 8M | 1568 | {0, ..., 9} | yes
(5) CIFAR 10 [26] | K-logistic | 60K | 2304 | {0, ..., 9} | yes
(6) ImageNet [27] | K-logistic | 1.3M | 9216 | {0, ..., 999} | yes
(7) QuantumMachine [28] | K-ridge | 6K | 276 | [-800, -2000] | yes
(8) MolecularSpace [28] | K-ridge | 2.3M | 2850 | [0, 13] | no
Dataset Splits | No | No explicit statement regarding training/validation/test dataset splits (e.g., percentages, sample counts, or citations to predefined splits) was found.
Hardware Specification | Yes | The algorithms are executed on the machine with 16 AMD 2.4GHz Opteron CPUs and 200G memory.
Software Dependencies | No | No specific software dependencies with version numbers (e.g., 'Python 3.8', 'PyTorch 1.9') were mentioned in the paper.
Experiment Setup | Yes | For algorithms based on low rank kernel matrix approximation and random features, i.e., pegasos and SDCA, we set the rank and number of random features to be 2^8. We use the same batch size for both our algorithm and the competitors. We stop algorithms when they pass through the entire dataset once.
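As a point of reference for this setup, the sketch below builds a fixed 2^8 = 256-dimensional random-feature baseline of the kind used in the pegasos/SDCA comparisons. It assumes scikit-learn's RBFSampler and a hinge-loss linear SGD model in place of the paper's actual solvers; the data, kernel bandwidth, and regularization are placeholders, not the paper's settings. The single pass (max_iter=1) mirrors the stopping rule of halting after one pass through the dataset.

```python
# Fixed-rank random-feature baseline with 2^8 = 256 features, a stand-in for the
# pegasos/SDCA comparison setup; data and hyperparameters are placeholders.
import numpy as np
from sklearn.kernel_approximation import RBFSampler
from sklearn.linear_model import SGDClassifier

rng = np.random.RandomState(0)
X = rng.randn(1000, 54)                       # placeholder inputs (Forest-like input dim)
y = np.where(rng.rand(1000) > 0.5, 1, -1)     # placeholder {-1, 1} labels

features = RBFSampler(gamma=0.5, n_components=2**8, random_state=0)  # 256 random features
Z = features.fit_transform(X)

clf = SGDClassifier(loss="hinge", alpha=1e-4, max_iter=1, tol=None)   # single pass over the data
clf.fit(Z, y)
print("training accuracy:", clf.score(Z, y))
```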