High-Dimensional Kernel Methods under Covariate Shift: Data-Dependent Implicit Regularization
Authors: Yihang Chen, Fanghui Liu, Taiji Suzuki, Volkan Cevher
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To quantitatively evaluate our derived error bounds for the bias and variance, we generate a synthetic dataset under a known f_ρ, with different decays of the kernel matrix. ... Fig. 1 (a)-(f) show the trends of the test risk, variance, and bias, which match our upper bound. |
| Researcher Affiliation | Academia | 1Laboratory for Information and Inference Systems, École Polytechnique Fédérale de Lausanne (EPFL), Switzerland. 2Department of Computer Science, University of Warwick, United Kingdom. 3Department of Mathematical Informatics, The University of Tokyo, Japan. 4Center for Advanced Intelligence Project, RIKEN, Tokyo, Japan. |
| Pseudocode | No | The paper does not contain a pseudocode or algorithm block. |
| Open Source Code | No | The paper does not provide a link to open-source code or explicitly state that code is available. |
| Open Datasets | No | To quantitatively evaluate our derived error bounds for the bias and variance, we generate a synthetic dataset under a known f_ρ, with different decays of the kernel matrix. ... The training samples x_i are generated as x_{p,i} = Σ_p^{1/2} z_i, and the test samples as x_{q,i} = Σ_q^{1/2} z_i. |
| Dataset Splits | No | The paper specifies the number of training and test data points but does not detail a validation split or cross-validation setup. |
| Hardware Specification | No | The paper does not explicitly describe the hardware used to run its experiments. |
| Software Dependencies | No | The paper does not provide specific version numbers for software dependencies. |
| Experiment Setup | Yes | We set the dimension d = 500 and the number of test data points to 2500. We vary the number of training data points over (100, 200, 300, 400, 450, 480, 520, 550, 600, 700, 784, 900, 1000, 1200, 1500, 2000). We use the kernel K(x, x′) = (1 + ⟨x, x′⟩/d)^p with p = 5, which admits β = p independent of Σ_p. We take the re-weighting function as the probability ratio of the distributions q and p, with the ratios truncated at 10. Finally, we run 10 random seeds and report the mean over runs. |