Gaussian Quadrature for Kernel Features

Authors: Tri Dao, Christopher M. De Sa, Christopher Ré

NeurIPS 2017

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | We validate our methods on datasets in different domains, such as MNIST and TIMIT, showing that deterministic features are faster to generate and achieve accuracy comparable to the state-of-the-art kernel methods based on random Fourier features. |
| Researcher Affiliation | Academia | Tri Dao, Department of Computer Science, Stanford University, Stanford, CA 94305, trid@stanford.edu; Christopher De Sa, Department of Computer Science, Cornell University, Ithaca, NY 14853, cdesa@cs.cornell.edu; Christopher Ré, Department of Computer Science, Stanford University, Stanford, CA 94305, chrismre@cs.stanford.edu |
| Pseudocode | No | The paper describes methods such as dense grid construction and sparse grid construction, and refers to "Algorithm 4.1 from Holtz [12]", but it does not provide pseudocode directly within the paper. |
| Open Source Code | No | The paper does not provide an explicit statement about releasing source code for its methodology, nor a link to a code repository. |
| Open Datasets | Yes | To evaluate the performance of deterministic feature maps, we analyzed the accuracy of a sparse ANOVA kernel on the MNIST digit classification task [16] and the TIMIT speech recognition task [5]. |
| Dataset Splits | Yes | This task consists of 70,000 examples (60,000 in the training dataset and 10,000 in the test dataset) of hand-written digits which need to be classified. |
| Hardware Specification | No | The paper mentions a procedure run "on CPU" and refers to "architectures such as application-specific integrated circuits (ASICs)", but it does not give specific CPU or GPU models or other hardware details for the experiments. |
| Software Dependencies | No | The paper does not name specific software dependencies with version numbers, such as Python or PyTorch versions. |
| Experiment Setup | Yes | Each data point corresponds to a frame (10 ms) of audio data, preprocessed using the standard feature-space Maximum Likelihood Linear Regression (fMLLR) [4]. The input x has dimension 40. After generating kernel features z(x) from this input, we model the corresponding phonemes y by a multinomial logistic regression model. Again, we use a sparse ANOVA kernel, which is a sum of 50 sub-kernels of the form exp(−γ‖x_S − y_S‖²), each acting on a subset S of 5 indices. These subsets are randomly chosen a priori. To reweight the quadrature features, we sample 500 data points out of 1 million. |
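The sparse ANOVA kernel quoted in the Experiment Setup row can be sketched directly. This is a minimal illustration, not the paper's code: the quoted setup fixes only the input dimension (40), the number of sub-kernels (50), and the subset size (5), so the bandwidth `gamma` and the particular random subsets below are hypothetical placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

d = 40           # input dimension (TIMIT fMLLR frame features)
n_subsets = 50   # number of Gaussian sub-kernels in the sparse ANOVA sum
subset_size = 5  # indices per subset S
gamma = 1.0      # hypothetical bandwidth; the quoted setup does not state its value

# The subsets S are "randomly chosen a priori", then held fixed.
subsets = [rng.choice(d, size=subset_size, replace=False) for _ in range(n_subsets)]

def sparse_anova_kernel(x, y):
    """Sum of sub-kernels exp(-gamma * ||x_S - y_S||^2) over the fixed subsets."""
    return sum(np.exp(-gamma * np.sum((x[S] - y[S]) ** 2)) for S in subsets)

x = rng.standard_normal(d)
y = rng.standard_normal(d)
k_xy = sparse_anova_kernel(x, y)
```

Since each sub-kernel equals 1 when its restricted inputs coincide, `sparse_anova_kernel(x, x)` is exactly the number of sub-kernels, which is a convenient sanity check on any feature-map approximation of this kernel.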