High-Dimensional Kernel Methods under Covariate Shift: Data-Dependent Implicit Regularization
Authors: Yihang Chen, Fanghui Liu, Taiji Suzuki, Volkan Cevher
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To quantitatively evaluate our derived error bounds for the bias and variance, we generate a synthetic dataset under a known f_ρ, with different decays of the kernel matrix. ... Fig. 1 (a)-(f) show the trends of the test risk, variance, and bias, which match our upper bound. |
| Researcher Affiliation | Academia | 1Laboratory for Information and Inference Systems, École Polytechnique Fédérale de Lausanne (EPFL), Switzerland. 2Department of Computer Science, University of Warwick, United Kingdom. 3Department of Mathematical Informatics, The University of Tokyo, Japan. 4Center for Advanced Intelligence Project, RIKEN, Tokyo, Japan. |
| Pseudocode | No | The paper does not contain a pseudocode or algorithm block. |
| Open Source Code | No | The paper does not provide a link to open-source code or explicitly state that code is available. |
| Open Datasets | No | To quantitatively evaluate our derived error bounds for the bias and variance, we generate a synthetic dataset under a known f_ρ, with different decays of the kernel matrix. ... The training samples x_i are generated as x_{p,i} = Σ_p^{1/2} z_i, and the test samples as x_{q,i} = Σ_q^{1/2} z_i. |
| Dataset Splits | No | The paper specifies the number of training and test data points but does not detail a validation split or cross-validation setup. |
| Hardware Specification | No | The paper does not explicitly describe the hardware used to run its experiments. |
| Software Dependencies | No | The paper does not provide specific version numbers for software dependencies. |
| Experiment Setup | Yes | We set the dimension d = 500 and the number of test data points to 2500. We vary the number of training data points over (100, 200, 300, 400, 450, 480, 520, 550, 600, 700, 784, 900, 1000, 1200, 1500, 2000). We use the kernel K(x, x′) = (1 + ⟨x, x′⟩/d)^p with p = 5, which admits β = p independent of Σ_p. We take the re-weighting function as the probability ratio of the distributions q and p, with the ratios truncated at 10. Finally, we run 10 random seeds and report the mean over runs. |