Orthogonal Random Features

Authors: Felix Xinnan X. Yu, Ananda Theertha Suresh, Krzysztof M. Choromanski, Daniel N. Holtmann-Rice, Sanjiv Kumar

NeurIPS 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on several datasets verify the effectiveness of ORF and SORF over the existing methods.
Researcher Affiliation | Industry | Google Research, New York. {felixyu, theertha, kchoro, dhr, sanjivk}@google.com
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any explicit statement or link regarding the release of open-source code for the described methodology.
Open Datasets | Yes | We first show kernel approximation performance on six datasets. The input feature dimension d is set to a power of 2 by padding zeros or subsampling. Figure 4 compares the mean squared error (MSE) of all methods. For fixed D, the kernel approximation MSE exhibits the following ordering: SORF ≈ ORF < QMC [25] < RFF [19] < other fast kernel approximations [13, 28]. ORF and SORF are also applied to classification tasks: Table 2 shows classification accuracy for different kernel approximation techniques with a (linear) SVM classifier. Datasets mentioned include LETTER, FOREST, USPS, CIFAR, MNIST, and GISETTE. (A sketch of the ORF construction follows the table.)
Dataset Splits | No | The paper mentions using datasets for classification tasks but does not provide specific details on training, validation, or test set splits, nor does it specify cross-validation methods.
Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU/CPU models, memory specifications) used for running its experiments.
Software Dependencies | No | The paper does not specify any software dependencies with version numbers (e.g., programming languages, libraries, or frameworks) required to replicate the experiments.
Experiment Setup | Yes | Throughout the experiments, σ for each dataset is chosen to be the mean distance to the 50th ℓ2 nearest neighbor over 1,000 sampled datapoints, which empirically yields good classification results [28]. On the role of σ: a very small σ leads to overfitting, while a very large σ provides no discriminative power for classification. (A sketch of this bandwidth heuristic follows the table.)
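
The MSE ordering quoted in the Open Datasets row rests on the paper's core construction: ORF replaces the i.i.d. Gaussian projection matrix of standard random Fourier features with a properly scaled random orthogonal matrix, which lowers the approximation error of the Gaussian kernel at the same feature dimension. Below is a minimal NumPy sketch of that construction, assuming the usual cos/sin feature map; the function name and the block-stacking for D > d are illustrative choices, not code from the paper.

```python
import numpy as np

def orf_features(X, D, sigma, seed=0):
    """Orthogonal Random Features for the Gaussian kernel (sketch).

    Standard RFF draws a projection matrix with i.i.d. N(0, 1/sigma^2) rows.
    ORF instead uses W = S Q / sigma, where Q is a uniformly random
    orthogonal matrix and the diagonal S resamples each row norm from the
    chi distribution, so rows match the Gaussian norm distribution while
    being mutually orthogonal.
    """
    n, d = X.shape
    rng = np.random.default_rng(seed)
    blocks = []
    for _ in range(-(-D // d)):  # ceil(D / d) independent d x d blocks
        G = rng.standard_normal((d, d))
        Q, R = np.linalg.qr(G)
        Q *= np.sign(np.diag(R))                  # sign fix: Haar-distributed Q
        S = np.sqrt(rng.chisquare(df=d, size=d))  # chi-distributed row norms
        blocks.append(S[:, None] * Q)
    W = np.vstack(blocks)[:D] / sigma
    proj = X @ W.T
    # phi(x)^T phi(y) is an estimate of exp(-||x - y||^2 / (2 sigma^2)).
    return np.hstack([np.cos(proj), np.sin(proj)]) / np.sqrt(D)
```

SORF goes further, replacing the orthogonal matrix with products of Hadamard and random sign-flip matrices so the projection costs O(D log d) time instead of O(Dd); the sketch above covers only ORF.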
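
The bandwidth rule in the Experiment Setup row is straightforward to reproduce. Below is a minimal sketch, assuming plain ℓ2 distances and uniform sampling of the 1,000 reference points; the function name, seed, and brute-force distance computation are illustrative assumptions, not the paper's code.

```python
import numpy as np

def estimate_sigma(X, k=50, n_samples=1000, seed=0):
    """Mean distance to the k-th nearest neighbor over a random sample.

    Mirrors the heuristic in the paper: sigma is the mean l2 distance to
    the 50th nearest neighbor, computed for 1,000 sampled datapoints.
    """
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), size=min(n_samples, len(X)), replace=False)
    S = X[idx]
    # Dense pairwise distances from the sample to the full dataset;
    # fine for moderately sized datasets.
    d2 = (np.sum(S**2, axis=1)[:, None]
          - 2.0 * S @ X.T
          + np.sum(X**2, axis=1)[None, :])
    dists = np.sqrt(np.maximum(d2, 0.0))
    # Column k of the row-sorted distances skips the point itself,
    # which sits at column 0 with distance 0.
    kth = np.sort(dists, axis=1)[:, k]
    return float(kth.mean())
```

The resulting value plugs directly into the Gaussian kernel exp(-||x - y||^2 / (2σ^2)) that ORF and SORF approximate.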