Gaussian Quadrature for Kernel Features

Authors: Tri Dao, Christopher M. De Sa, Christopher Ré

NeurIPS 2017

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | We validate our methods on datasets in different domains, such as MNIST and TIMIT, showing that deterministic features are faster to generate and achieve accuracy comparable to the state-of-the-art kernel methods based on random Fourier features. |
| Researcher Affiliation | Academia | Tri Dao, Department of Computer Science, Stanford University, Stanford, CA 94305, trid@stanford.edu; Christopher De Sa, Department of Computer Science, Cornell University, Ithaca, NY 14853, cdesa@cs.cornell.edu; Christopher Ré, Department of Computer Science, Stanford University, Stanford, CA 94305, chrismre@cs.stanford.edu |
| Pseudocode | No | The paper describes methods such as dense grid construction and sparse grid construction, and refers to "Algorithm 4.1 from Holtz [12]", but it does not provide pseudocode directly within the paper. |
| Open Source Code | No | The paper does not provide an explicit statement about releasing source code for its methodology, nor a link to a code repository. |
| Open Datasets | Yes | To evaluate the performance of deterministic feature maps, we analyzed the accuracy of a sparse ANOVA kernel on the MNIST digit classification task [16] and the TIMIT speech recognition task [5]. |
| Dataset Splits | Yes | This task consists of 70,000 examples (60,000 in the training dataset and 10,000 in the test dataset) of hand-written digits which need to be classified. |
| Hardware Specification | No | The paper mentions a procedure run "on CPU" and refers to "architectures such as application-specific integrated circuits (ASICs)", but it does not give specific CPU or GPU models or other hardware details for the experiments. |
| Software Dependencies | No | The paper does not name specific software dependencies with version numbers, such as Python or PyTorch versions. |
| Experiment Setup | Yes | Each data point corresponds to a frame (10 ms) of audio data, preprocessed using the standard feature-space Maximum Likelihood Linear Regression (fMLLR) [4]. The input x has dimension 40. After generating kernel features z(x) from this input, we model the corresponding phonemes y by a multinomial logistic regression model. Again, we use a sparse ANOVA kernel, which is a sum of 50 sub-kernels of the form exp(−γ‖x_S − y_S‖²), each acting on a subset S of 5 indices. These subsets are randomly chosen a priori. To reweight the quadrature features, we sample 500 data points out of 1 million. |
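The sparse ANOVA kernel quoted in the Experiment Setup row can be sketched directly. This is a minimal illustration, not the paper's code: the quoted setup fixes only the input dimension (40), the number of sub-kernels (50), and the subset size (5), so the bandwidth `gamma` and the particular random subsets below are hypothetical placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

d = 40           # input dimension (TIMIT fMLLR frame features)
n_subsets = 50   # number of Gaussian sub-kernels in the sparse ANOVA sum
subset_size = 5  # indices per subset S
gamma = 1.0      # hypothetical bandwidth; the quoted setup does not state its value

# The subsets S are "randomly chosen a priori", then held fixed.
subsets = [rng.choice(d, size=subset_size, replace=False) for _ in range(n_subsets)]

def sparse_anova_kernel(x, y):
    """Sum of sub-kernels exp(-gamma * ||x_S - y_S||^2) over the fixed subsets."""
    return sum(np.exp(-gamma * np.sum((x[S] - y[S]) ** 2)) for S in subsets)

x = rng.standard_normal(d)
y = rng.standard_normal(d)
k_xy = sparse_anova_kernel(x, y)
```

Since each sub-kernel equals 1 when its restricted inputs coincide, `sparse_anova_kernel(x, x)` is exactly the number of sub-kernels, which is a convenient sanity check on any feature-map approximation of this kernel.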