Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Maximum likelihood estimation in Gaussian process regression is ill-posed

Authors: Toni Karvonen, Chris J. Oates

JMLR 2023 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Our attempts at proving any properties of $\lambda^{\delta}_{\mathrm{ML}}$ have been unsuccessful, and therefore here we limit ourselves to a simple numerical investigation, the results of which are depicted in Figure 4.
Researcher Affiliation	Academia	Toni Karvonen EMAIL Department of Mathematics and Statistics University of Helsinki PL 56 (Pietari Kalmin katu 5) 00014 Helsingin yliopisto, Finland Chris J. Oates EMAIL School of Mathematics, Statistics and Physics Newcastle University Newcastle upon Tyne, NE1 7RU, United Kingdom
Pseudocode	No	The paper describes methods and proofs mathematically. No sections or figures are explicitly labeled as "Pseudocode" or "Algorithm".
Open Source Code	No	The paper mentions existing Gaussian process software such as "GPy, since 2012; Matthews et al., 2017" in the context of common approaches, but does not state that the authors have released code for the methodology described in this paper.
Open Datasets	No	The paper focuses on theoretical analysis using an abstract "noiseless training data set $Y = (y_1, \ldots, y_n) \in \mathbb{R}^n$, associated to a set $X$ of distinct covariates $x_1, \ldots, x_n \in \mathbb{R}^d$". It does not refer to any specific publicly available datasets used for experimental evaluation.
Dataset Splits	No	The paper does not use specific publicly available datasets or describe any experimental splits for training, validation, or testing.
Hardware Specification	No	The paper includes a "numerical investigation" in Section 4.1 but does not provide any specific details about the hardware (e.g., CPU, GPU models) used for these computations. It only mentions "Minimisation was performed using grid search."
Software Dependencies	No	The paper mentions general Gaussian process software like GPy and GPflow as common tools in the field but does not provide specific version numbers for any software used in their own numerical investigation or for replicating their results.
Experiment Setup	Yes	The regularised maximum likelihood estimate $\lambda^{\delta}_{\mathrm{ML}}$ in (4.2) as a function of the regularisation parameter $\delta$ for four different data vectors $Y \in \mathbb{R}^3$ (we set $m \equiv 0$) when $X = \{1, 1.2, 2.0\} \subset \mathbb{R}$ and $K$ is the Matérn kernel in (2.3) with parameters $\sigma = 1$ and $\nu = 3/2$. Minimisation was performed using grid search.
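
The experiment setup quoted in the last row can be illustrated with a rough sketch. This is not the authors' code, and several pieces are assumptions: the paper's regularised objective (4.2) is not reproduced on this page, so the sketch substitutes a common stand-in (adding δ to the diagonal of the kernel matrix); λ is assumed to be the kernel length-scale; and the data vector `y`, the grid, and all function names are hypothetical. Only the covariates X = {1, 1.2, 2.0}, the Matérn kernel with σ = 1 and ν = 3/2, the zero prior mean, and the use of grid search come from the quoted text.

```python
import numpy as np

def matern32(x, lengthscale, sigma=1.0):
    """Matern kernel matrix with nu = 3/2 for one-dimensional inputs x."""
    r = np.abs(x[:, None] - x[None, :])
    s = np.sqrt(3.0) * r / lengthscale
    return sigma**2 * (1.0 + s) * np.exp(-s)

def neg_log_lik(y, x, lengthscale, delta):
    """Zero-mean Gaussian negative log-likelihood.

    ASSUMPTION: delta enters as a diagonal (nugget) term. The paper's
    regulariser in (4.2) may differ; it is not shown on this page.
    """
    n = len(x)
    K = matern32(x, lengthscale) + delta * np.eye(n)
    L = np.linalg.cholesky(K)           # K = L L^T
    alpha = np.linalg.solve(L, y)       # alpha @ alpha = y^T K^{-1} y
    log_det = 2.0 * np.log(np.diag(L)).sum()
    return 0.5 * (alpha @ alpha + log_det + n * np.log(2.0 * np.pi))

def grid_search_lengthscale(y, x, delta, grid):
    """Return the grid point that minimises the negative log-likelihood."""
    values = np.array([neg_log_lik(y, x, lam, delta) for lam in grid])
    return grid[np.argmin(values)]

x = np.array([1.0, 1.2, 2.0])           # covariates from the quoted setup
y = np.array([0.5, -0.3, 1.0])          # hypothetical data vector
grid = np.logspace(-2, 2, 400)          # hypothetical length-scale grid

# Estimated length-scale for a few hypothetical regularisation levels.
estimates = {d: grid_search_lengthscale(y, x, d, grid) for d in (1e-6, 1e-3, 1e-1)}
for d, lam in estimates.items():
    print(f"delta = {d:g}: estimated length-scale = {lam:.4g}")
```

Because the minimiser is taken over a finite grid, the returned value is only as fine as the grid spacing, and an estimate that lands on either end of the grid suggests the true minimiser lies outside the searched range.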