Learning Kernels with Random Features

Authors: Aman Sinha, John C. Duchi

NeurIPS 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We experimentally evaluate our technique on several datasets.
Researcher Affiliation | Academia | Aman Sinha and John Duchi, Departments of Electrical Engineering and Statistics, Stanford University; {amans,jduchi}@stanford.edu
Pseudocode | Yes | Algorithm 1: Kernel optimization with f(t) = t^k - 1 as divergence. (A minimal sketch of this divergence family follows the table.)
Open Source Code | No | The paper does not provide any explicit statement about releasing source code or a link to a code repository for the described methodology.
Open Datasets | Yes | For the adult dataset we employ the Gaussian kernel with a logistic regression model, and for the reuters dataset we employ a linear kernel with a ridge regression model. For the buzz dataset we employ ridge regression with an arc-cosine kernel of order 2, i.e. P0 = N(0, I) and φ(x, w) = H(w^T x)(w^T x)^2, where H(·) is the Heaviside step function [7]. (A hedged sketch of this feature map follows the table.)
Dataset Splits | No | The paper mentions training and test set sizes but does not explicitly describe a validation split used in its experiments.
Hardware Specification | No | The paper does not provide specific hardware details such as exact GPU/CPU models, processor types, or memory amounts used for running its experiments.
Software Dependencies | No | The paper mentions the machine learning models and algorithms used (e.g., logistic regression, ridge regression) but does not provide specific version numbers for any software dependencies or libraries.
Experiment Setup | Yes | For our approach, we use the χ²-divergence (k = 2, i.e. f(t) = t^2 - 1). Letting q̂ denote the solution to problem (4), we use two variants of our approach: when D < nnz(q̂) we use estimator (5), and we use estimator (6) otherwise. For the original randomized feature approach, we relax the constraint in problem (7) with an ℓ2 penalty. Finally, for the joint optimization in which we learn the kernel and classifier together, we consider the kernel-learning objective... We use a standard primal-dual algorithm [4] to solve the min-max problem (9). (A hedged sketch of the χ² weighting step follows the table.)
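
The pseudocode row quotes Algorithm 1's divergence family f(t) = t^k - 1. As a minimal point of reference, the sketch below evaluates the induced f-divergence D_f(q || p) = sum_m p_m f(q_m / p_m); the function name and vectorized form are our own convention, not the paper's, and k = 2 recovers the χ² divergence used in the experiments.

```python
import numpy as np

def f_divergence(q, p, k=2):
    """D_f(q || p) with generator f(t) = t**k - 1.

    q and p are probability vectors of equal length; k = 2 gives the
    chi-square divergence used in the paper's experiments.
    """
    t = q / p
    return float(np.sum(p * (t ** k - 1.0)))
```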
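
The open-datasets row describes the order-2 arc-cosine feature map used for buzz. Here is a minimal sketch, assuming the usual 1/sqrt(D) Monte Carlo scaling so that inner products of feature vectors approximate the kernel; the function name, the seed argument, and the scaling convention are our assumptions, not the paper's.

```python
import numpy as np

def arccos2_features(X, D, seed=0):
    """Order-2 arc-cosine random features phi(x, w) = H(w^T x) * (w^T x)^2
    with w drawn from P0 = N(0, I), as quoted for the buzz dataset.
    X has shape (n_samples, n_dims); returns an (n_samples, D) matrix."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], D))   # D directions w ~ N(0, I)
    Z = X @ W                                  # projections w^T x
    return np.where(Z > 0.0, Z ** 2, 0.0) / np.sqrt(D)  # Heaviside gate
```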
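
The experiment-setup row solves a χ²-constrained weighting over random features (problem (4)). The sketch below is a heuristic reading of that step, not the paper's Algorithm 1: we assume a per-feature alignment score z_m = (y^T Φ_{:,m})² / n² and maximize q^T z over the simplex intersected with the χ² ball {q : N sum_m q_m² - 1 <= ρ} by bisecting on the KKT threshold, whereas Algorithm 1 finds that threshold exactly with a sort-based search.

```python
import numpy as np

def chi2_feature_weights(Phi, y, rho):
    """Heuristic sketch: maximize sum_m q_m * z_m over the simplex
    intersected with the chi-square ball around uniform (f(t) = t**2 - 1).

    Phi: (n, N) random-feature matrix; y: (n,) labels; rho: ball radius.
    KKT conditions give q_m proportional to max(z_m - lam, 0); we bisect
    on lam until the chi-square constraint is (approximately) tight.
    """
    n, N = Phi.shape
    z = (Phi.T @ y) ** 2 / n ** 2              # assumed alignment scores

    def q_of(lam):
        w = np.maximum(z - lam, 0.0)
        s = w.sum()
        if s == 0.0:                           # lam at max(z): concentrated limit
            q = np.zeros(N)
            q[np.argmax(z)] = 1.0
            return q
        return w / s

    lo = z.min() - 1e6 * (z.max() - z.min() + 1.0)  # near-uniform end, chi2 ~ 0
    hi = z.max()                                    # fully concentrated end
    for _ in range(100):
        lam = 0.5 * (lo + hi)
        chi2 = N * np.sum(q_of(lam) ** 2) - 1.0
        if chi2 > rho:
            hi = lam                           # too concentrated: lower threshold
        else:
            lo = lam
    return q_of(lo)                            # feasible side of the bisection
```

Given q̂, the quoted setup switches between estimators (5) and (6) depending on D versus nnz(q̂); reweighting or resampling the features according to q̂ are the natural readings, but we do not attempt to reproduce those estimators here.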