Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Learning Kernels with Random Features
Authors: Aman Sinha, John C. Duchi
NeurIPS 2016 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We experimentally evaluate our technique on several datasets. |
| Researcher Affiliation | Academia | Aman Sinha¹, John Duchi¹٬² — Departments of ¹Electrical Engineering and ²Statistics, Stanford University. EMAIL |
| Pseudocode | Yes | Algorithm 1: Kernel optimization with f(t) = t^k − 1 as divergence |
| Open Source Code | No | The paper does not provide any explicit statement about releasing source code or a link to a code repository for the described methodology. |
| Open Datasets | Yes | For the adult dataset we employ the Gaussian kernel with a logistic regression model, and for the reuters dataset we employ a linear kernel with a ridge regression model. For the buzz dataset we employ ridge regression with an arc-cosine kernel of order 2, i.e. P0 = N(0, I) and φ(x, w) = H(wᵀx)(wᵀx)², where H(·) is the Heaviside step function [7]. |
| Dataset Splits | No | The paper mentions training and test set sizes but does not explicitly provide details about a validation dataset split used in their experiments. |
| Hardware Specification | No | The paper does not provide specific hardware details such as exact GPU/CPU models, processor types, or memory amounts used for running its experiments. |
| Software Dependencies | No | The paper mentions machine learning models and algorithms used (e.g., logistic regression, ridge regression), but does not provide specific version numbers for any software dependencies or libraries. |
| Experiment Setup | Yes | For our approach, we use the χ²-divergence (k = 2, i.e. f(t) = t² − 1). Letting q̂ denote the solution to problem (4), we use two variants of our approach: when D < nnz(q̂) we use estimator (5), and we use estimator (6) otherwise. For the original randomized feature approach, we relax the constraint in problem (7) with an ℓ2 penalty. Finally, for the joint optimization in which we learn the kernel and classifier together, we consider the kernel-learning objective... We use a standard primal-dual algorithm [4] to solve the min-max problem (9). |
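The table quotes the paper's random-feature setup for the buzz dataset: draws w ~ P0 = N(0, I) with feature map φ(x, w) = H(wᵀx)(wᵀx)², the order-2 arc-cosine kernel. Since no source code was released, the following is only an illustrative sketch of that feature map (function name and scaling convention are our own, not the authors'):

```python
import numpy as np

def arc_cosine_features(X, D, seed=None):
    """Map rows of X (n x d) to D random features of the
    order-2 arc-cosine kernel: phi(x, w) = H(w^T x) * (w^T x)^2,
    with w ~ N(0, I) and H the Heaviside step function."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W = rng.standard_normal((d, D))       # columns w_j ~ P0 = N(0, I)
    Z = X @ W                             # Z[i, j] = w_j^T x_i
    Phi = np.where(Z > 0, Z**2, 0.0)      # Heaviside gate times (w^T x)^2
    return Phi / np.sqrt(D)               # so Phi @ Phi.T estimates the kernel matrix

# Usage: inner products of feature vectors approximate kernel evaluations.
X = np.random.default_rng(0).standard_normal((5, 3))
Phi = arc_cosine_features(X, D=2000, seed=1)
K_approx = Phi @ Phi.T                    # Monte Carlo estimate of the kernel matrix
```

With features in hand, the paper plugs them into a linear model (here, ridge regression); the kernel-learning step then reweights the sampled features via the χ²-divergence problem quoted above, which this sketch does not implement.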