Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.
Orthogonal Random Features
Authors: Felix Xinnan X. Yu, Ananda Theertha Suresh, Krzysztof M. Choromanski, Daniel N. Holtmann-Rice, Sanjiv Kumar
NeurIPS 2016 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on several datasets verify the effectiveness of ORF and SORF over the existing methods. |
| Researcher Affiliation | Industry | Google Research, New York EMAIL |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any explicit statement or link regarding the release of open-source code for the described methodology. |
| Open Datasets | Yes | We first show kernel approximation performance on six datasets. The input feature dimension d is set to be power of 2 by padding zeros or subsampling. Figure 4 compares the mean squared error (MSE) of all methods. For fixed D, the kernel approximation MSE exhibits the following ordering: SORF ≃ ORF < QMC [25] < RFF [19] < Other fast kernel approximations [13, 28]. We also apply ORF and SORF on classification tasks. Table 2 shows classification accuracy for different kernel approximation techniques with a (linear) SVM classifier. Datasets mentioned include LETTER, FOREST, USPS, CIFAR, MNIST, GISETTE. |
| Dataset Splits | No | The paper mentions using datasets for classification tasks but does not provide specific details on training, validation, or test set splits, nor does it specify cross-validation methods. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU/CPU models, memory specifications) used for running its experiments. |
| Software Dependencies | No | The paper does not specify any software dependencies with version numbers (e.g., programming languages, libraries, or frameworks) required to replicate the experiments. |
| Experiment Setup | Yes | For each dataset, σ is chosen to be the mean distance of the 50th ℓ2 nearest neighbor for 1,000 sampled datapoints. Empirically, this yields good classification results. The role of σ: Note that a very small σ will lead to overfitting, and a very large σ provides no discriminative power for classification. Throughout the experiments, σ for each dataset is chosen to be the mean distance of the 50th ℓ2 nearest neighbor, which empirically yields good classification results [28]. |
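
The bandwidth heuristic quoted in the Experiment Setup row can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `estimate_sigma`, its parameters, and the choice to search for neighbors across the whole dataset (rather than only among the sampled points) are assumptions, since the quoted text does not specify them.

```python
import numpy as np


def estimate_sigma(X, k=50, n_samples=1000, seed=0):
    """Mean l2 distance from sampled points to their k-th nearest neighbor.

    Hypothetical helper: neighbors are searched in all of X (an assumption).
    Requires X to have more than k rows.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Sample up to n_samples distinct rows of X.
    idx = rng.choice(n, size=min(n_samples, n), replace=False)
    S = X[idx]
    # Squared l2 distances between each sampled point and every point in X.
    sq = (S ** 2).sum(axis=1)[:, None] + (X ** 2).sum(axis=1)[None, :] - 2.0 * S @ X.T
    np.maximum(sq, 0.0, out=sq)  # clip small negatives caused by round-off
    # Each sampled point is also in X at distance 0, so position k in sorted
    # order (0-based) is the k-th nearest neighbor excluding the point itself.
    kth_sq = np.partition(sq, k, axis=1)[:, k]
    return float(np.sqrt(kth_sq).mean())
```

For example, `estimate_sigma(X)` with the defaults `k=50` and `n_samples=1000` mirrors the quoted setting; the resulting value would then serve as the Gaussian kernel bandwidth σ.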