Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Towards a Unified Analysis of Random Fourier Features
Authors: Zhu Li, Jean-Francois Ton, Dino Oglic, Dino Sejdinovic
JMLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our empirical results illustrate the effectiveness of the proposed scheme relative to the standard random Fourier features method. ... In this section, we report the results of our numerical experiments (on both simulated and real-world datasets) aimed at validating the theoretical results and demonstrating the utility of Algorithm 1. |
| Researcher Affiliation | Collaboration | Zhu Li (Gatsby Computational Neuroscience Unit, University College London, London, UK; Department of Statistics, University of Oxford, Oxford, UK); Jean-Francois Ton (Department of Statistics, University of Oxford, Oxford, UK); Dino Oglic (AstraZeneca PLC, Cambridge, UK; Department of Engineering, King's College London, London, UK); Dino Sejdinovic (Department of Statistics, University of Oxford, Oxford, UK) |
| Pseudocode | Yes | Algorithm 1 APPROXIMATE LEVERAGE WEIGHTED RFF. Input: sample of examples {(x_i, y_i)}_{i=1}^n, shift-invariant kernel function k, and regularization parameter λ. Output: set of features {(v_1, p_1), ..., (v_m, p_m)} with m and each p_i computed as in lines 3-4. 1: sample a pool of s random Fourier features {v_1, ..., v_s} from p(v). 2: create a feature matrix Z_s such that the i-th row of Z_s is [z(v_1, x_i), ..., z(v_s, x_i)]^T. 3: associate with each feature v_i a positive real number p_i such that p_i is equal to the i-th diagonal element of Z_s^T Z_s ((1/s) Z_s^T Z_s + nλI)^{-1}. 4: m ← Σ_{i=1}^s p_i and M ← {(v_i, p_i/m)}_{i=1}^s. 5: sample m features from set M using the multinomial distribution given by vector (p_1/m, ..., p_s/m). |
| Open Source Code | No | The paper mentions using third-party software: "We use the ridge regression and SVM package from Pedregosa et al. (2011) as a solver to perform our experiments." and "LIBSVM: A library for support vector machines. ... Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm." However, there is no explicit statement that the authors' own source code for the methodology described in this paper is released or publicly available. |
| Open Datasets | Yes | Next, we make a comparison between the performances of leverage weighted (computed according to Algorithm 1) and plain RFF on real-world datasets. In particular, we use four datasets from Chang and Lin (2011) and Dheeru and Karra Taniskidou (2017) for this purpose, including two for regression and two for classification: CPU, KINEMATICS, COD-RNA and COVTYPE. |
| Dataset Splits | Yes | The Gaussian/RBF kernel is used for all the datasets with hyper-parameter tuning via 5-fold inner cross validation. We have repeated all the experiments 10 times and reported the average test error for each dataset. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory, etc.) used for running its experiments. It only mentions using software packages. |
| Software Dependencies | No | The paper mentions using "the ridge regression and SVM package from Pedregosa et al. (2011)" (referring to scikit-learn) and "LIBSVM" without specifying version numbers for these software dependencies. While a link to LIBSVM is provided, no version numbers are given for these key components. |
| Experiment Setup | No | The paper states, "The Gaussian/RBF kernel is used for all the datasets with hyper-parameter tuning via 5-fold inner cross validation. We carefully cross-validate both methods across a grid of hyper-parameters and report the results in Table 10." However, it does not provide concrete hyperparameter values (e.g., specific learning rates, batch sizes, regularization strength for the kernel, or the grid used for cross-validation) in the main text. |
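Since the paper's own code is not released, the pseudocode quoted in the table can be illustrated with a minimal sketch. The sketch below assumes a Gaussian/RBF kernel (whose spectral density is Gaussian) and a scalar feature map z(v, x) = √2·cos(vᵀx + b) with random phase b; these choices, the rounding of m to an integer, and all function and parameter names are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def rff_features(X, V, b):
    """Scalar random Fourier feature z(v, x) = sqrt(2) * cos(v^T x + b) per frequency.

    X: (n, d) data matrix; V: (s, d) frequencies; b: (s,) phases.
    Returns the (n, s) feature matrix Z_s whose i-th row is [z(v_1, x_i), ..., z(v_s, x_i)].
    """
    return np.sqrt(2.0) * np.cos(X @ V.T + b)

def approx_leverage_weighted_rff(X, s, lam, sigma=1.0, seed=0):
    """Sketch of the quoted Algorithm 1 (approximate leverage-weighted RFF).

    Assumes a Gaussian kernel with bandwidth sigma; names are illustrative.
    Returns the resampled frequencies, their phases, and their sampling weights.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # 1: sample a pool of s frequencies from the Gaussian kernel's spectral density p(v)
    V = rng.normal(scale=1.0 / sigma, size=(s, d))
    b = rng.uniform(0.0, 2.0 * np.pi, size=s)
    # 2: build the feature matrix Z_s
    Z = rff_features(X, V, b)
    # 3: approximate ridge leverage scores:
    #    p_i = [Z_s^T Z_s ((1/s) Z_s^T Z_s + n*lam*I)^{-1}]_ii  (nonnegative, since both
    #    factors are simultaneously diagonalizable PSD matrices)
    G = Z.T @ Z
    P = G @ np.linalg.inv(G / s + n * lam * np.eye(s))
    p = np.diag(P)
    # 4: m <- sum_i p_i (rounded up here so it can index an array) and weights p_i / m
    m = int(np.ceil(p.sum()))
    q = p / p.sum()
    # 5: resample m frequencies from the pool via the multinomial distribution q
    idx = rng.choice(s, size=m, replace=True, p=q)
    return V[idx], b[idx], q[idx]
```

The resampled frequencies can then be fed to any downstream ridge-regression or SVM solver in place of plain RFF frequencies, which is how the paper's experiments compare the two schemes.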