Learning Kernels with Random Features
Authors: Aman Sinha, John C. Duchi
NeurIPS 2016 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We experimentally evaluate our technique on several datasets. |
| Researcher Affiliation | Academia | Aman Sinha¹ John Duchi¹,² Departments of ¹Electrical Engineering and ²Statistics Stanford University {amans,jduchi}@stanford.edu |
| Pseudocode | Yes | Algorithm 1: Kernel optimization with f(t) = t^k - 1 as divergence |
| Open Source Code | No | The paper does not provide any explicit statement about releasing source code or a link to a code repository for the described methodology. |
| Open Datasets | Yes | For the adult dataset we employ the Gaussian kernel with a logistic regression model, and for the reuters dataset we employ a linear kernel with a ridge regression model. For the buzz dataset we employ ridge regression with an arc-cosine kernel of order 2, i.e. P0 = N(0, I) and φ(x, w) = H(w^T x)(w^T x)^2, where H(·) is the Heaviside step function [7]. |
| Dataset Splits | No | The paper mentions training and test set sizes but does not explicitly provide details about a validation dataset split used in their experiments. |
| Hardware Specification | No | The paper does not provide specific hardware details such as exact GPU/CPU models, processor types, or memory amounts used for running its experiments. |
| Software Dependencies | No | The paper mentions machine learning models and algorithms used (e.g., logistic regression, ridge regression), but does not provide specific version numbers for any software dependencies or libraries. |
| Experiment Setup | Yes | For our approach, we use the χ^2-divergence (k = 2, or f(t) = t^2 - 1). Letting q̂ denote the solution to problem (4), we use two variants of our approach: when D < nnz(q̂) we use estimator (5), and we use estimator (6) otherwise. For the original randomized feature approach, we relax the constraint in problem (7) with an ℓ2 penalty. Finally, for the joint optimization in which we learn the kernel and classifier together, we consider the kernel-learning objective... We use a standard primal-dual algorithm [4] to solve the min-max problem (9). |
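
The Open Datasets row quotes the paper's order-2 arc-cosine feature map used for the buzz experiments, φ(x, w) = H(w^T x)(w^T x)^2 with w drawn from P0 = N(0, I). As a reading aid, here is a minimal NumPy sketch of that feature map; the function name `arc_cosine_features` and the 1/sqrt(D) scaling convention are our own assumptions, not taken from the paper.

```python
import numpy as np

def arc_cosine_features(X, D, seed=None):
    """Order-2 arc-cosine random features: phi(x, w) = H(w^T x) * (w^T x)^2,
    with w_1, ..., w_D drawn i.i.d. from P0 = N(0, I).

    X: (n, d) data matrix; D: number of random features.
    """
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    W = rng.standard_normal((d, D))      # columns are the sampled directions w_m
    Z = X @ W                            # projections w_m^T x_i
    Phi = np.where(Z > 0, Z ** 2, 0.0)   # Heaviside gate times squared projection
    return Phi / np.sqrt(D)              # scale so Phi @ Phi.T approximates the kernel
```

The inner product of two rows of `Phi` then approximates the order-2 arc-cosine kernel, which is how the random-feature model for the buzz dataset would be built before any reweighting.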
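The Experiment Setup row describes solving problem (4) under the χ^2-divergence to obtain the feature weights q̂. The sketch below illustrates that reweighting step under two explicit assumptions: the per-feature score is taken to be a label-alignment term ((1/n) y^T φ_m)^2, and the χ^2-constrained maximization is handed to a generic SLSQP solver rather than the paper's specialized Algorithm 1; function and variable names are ours.

```python
import numpy as np
from scipy.optimize import minimize

def chi_square_feature_weights(Phi, y, rho=1.0):
    """Sketch of chi^2-constrained feature reweighting:
        max_q  sum_m q_m * align_m
        s.t.   q >= 0,  sum_m q_m = 1,  D_chi2(q || uniform) <= rho,
    where align_m is an assumed label-alignment score for feature m.

    Phi: (n, D) random-feature matrix on the training set; y: (n,) labels in {-1, +1}.
    """
    n, D = Phi.shape
    align = (Phi.T @ y / n) ** 2                    # assumed alignment score per feature
    constraints = [
        {"type": "eq", "fun": lambda q: q.sum() - 1.0},
        # chi^2 divergence to the uniform distribution: D * sum(q^2) - 1 <= rho
        {"type": "ineq", "fun": lambda q: (1.0 + rho) / D - np.sum(q ** 2)},
    ]
    q0 = np.full(D, 1.0 / D)                        # start from uniform weights
    # A generic solver stands in for the paper's specialized update; adequate for modest D.
    res = minimize(lambda q: -align @ q, q0, method="SLSQP",
                   bounds=[(0.0, None)] * D, constraints=constraints)
    return res.x
```

Given q̂ from such a solve, the quote's two estimators correspond to either reweighting the existing features by q̂ or resampling features according to q̂, with the choice driven by how D compares with nnz(q̂).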