Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Ivanov-Regularised Least-Squares Estimators over Large RKHSs and Their Interpolation Spaces

Authors: Stephen Page, Steffen Grünewälder

JMLR 2019 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | We provide rates of convergence for the expected squared L2 error of our estimator under the weak assumption that the variance of the response variables is bounded and the unknown regression function lies in an interpolation space between L2 and the RKHS. We then obtain faster rates of convergence when the regression function is bounded by clipping the estimator. In fact, we attain the optimal rate of convergence. Furthermore, we provide a high-probability bound under the stronger assumption that the response variables have subgaussian errors and that the regression function lies in an interpolation space between L∞ and the RKHS. Finally, we derive adaptive results for the settings in which the regression function is bounded.
Researcher Affiliation | Academia | Stephen Page (EMAIL), STOR-i, Lancaster University, Lancaster, LA1 4YF, United Kingdom; Steffen Grünewälder (EMAIL), Department of Mathematics and Statistics, Lancaster University, Lancaster, LA1 4YF, United Kingdom
Pseudocode | No | We can calculate an alternative ν(r) without diagonalising K. Note that if µ(r) > 0, then (3) can be written as Y^T (K + nµ(r)I)^{-1} K (K + nµ(r)I)^{-1} Y = r^2. Since µ(r) is strictly decreasing for µ(r) > 0, we have r ≥ (Y^T (K + nεI)^{-1} K (K + nεI)^{-1} Y)^{1/2} if and only if µ(r) ∈ [0, ε], so in this case we set ν(r) = ε. Otherwise, µ(r) > ε and (4) can be written as µ(r) ≤ n^{-1} (Y^T K Y)^{1/2} r^{-1}. The function Y^T (K + nµI)^{-1} K (K + nµI)^{-1} Y of µ > 0 is continuous. Hence, we can calculate ν(r) using interval bisection on the interval with lower end point ε and upper end point equal to the right-hand side of (5).
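The bisection procedure quoted above can be sketched in code. The following is a minimal illustration, not the authors' implementation: the function name, the floor ε, and the tolerance are assumptions, and the monotone quantity Y^T (K + nµI)^{-1} K (K + nµI)^{-1} Y is evaluated with a linear solve rather than by diagonalising K.

```python
import numpy as np

def nu_of_r(K, Y, r, eps=1e-6, tol=1e-10):
    """Illustrative sketch: solve Y^T (K + n*mu*I)^{-1} K (K + n*mu*I)^{-1} Y
    = r^2 for mu by interval bisection, returning nu(r) = max(mu(r), eps)."""
    n = len(Y)
    I = np.eye(n)

    def phi(mu):
        # phi(mu) = Y^T (K + n*mu*I)^{-1} K (K + n*mu*I)^{-1} Y,
        # strictly decreasing in mu > 0; computed via one linear solve.
        z = np.linalg.solve(K + n * mu * I, Y)
        return z @ K @ z

    # If r^2 >= phi(eps), then mu(r) <= eps, so clamp nu(r) to eps.
    if r * r >= phi(eps):
        return eps

    # Otherwise mu(r) > eps, and the quoted bound gives the upper end point
    # mu(r) <= n^{-1} (Y^T K Y)^{1/2} / r for the bisection interval.
    lo, hi = eps, (Y @ K @ Y) ** 0.5 / (n * r)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if phi(mid) > r * r:
            lo = mid  # phi(mid) too large, so mu(r) lies above mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Because phi is strictly decreasing and continuous in µ, the bisection is guaranteed to bracket the unique root on the stated interval.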
Open Source Code | No | The paper does not contain any explicit statements or links indicating the release of open-source code for the described methodology.
Open Datasets | No | We now formally define our regression problem. For a topological space T, let B(T) be the Borel σ-algebra of T. Let (S, S) be a measurable space. Assume that (X_i, Y_i) for 1 ≤ i ≤ n are (S × R, S ⊗ B(R))-valued random variables on the probability space (Ω, F, P), which are i.i.d. with X_i ∼ P and E(Y_i^2) < ∞, where E denotes integration with respect to P.
Dataset Splits | No | The paper is theoretical and does not describe experiments on specific datasets, so there is no mention of training, validation, or test dataset splits.
Hardware Specification | No | The paper does not describe any computational experiments or their execution, and therefore does not provide any hardware specifications.
Software Dependencies | No | The paper focuses on theoretical analysis and does not describe any implementation details, so no specific software dependencies or version numbers are mentioned.
Experiment Setup | No | The paper is theoretical, presenting mathematical analysis and proofs rather than experimental results. Consequently, no details are provided regarding experimental setup, hyperparameters, or training configurations.