Smoothness Adaptive Hypothesis Transfer Learning

Authors: Haotian Lin, Matthew Reimherr

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | This section aims to confirm our theoretical results in the target-only KRR and transfer learning sections. The theoretical findings and the effectiveness of SATL are confirmed by several experiments.
Researcher Affiliation | Academia | Haotian Lin (1), Matthew Reimherr (1); (1) Department of Statistics, Pennsylvania State University, University Park, PA, USA. Correspondence to: Haotian Lin <hzl435@psu.edu>.
Pseudocode | Yes | Algorithm 1 OTL-KRR (a hedged sketch of a two-phase offset structure appears after the table).
Open Source Code | Yes | The code to reproduce our experimental results is available at https://github.com/haotianlin/SATL.
Open Datasets | Yes | To generate an f_0 with the desired Sobolev smoothness, we set f_0 to be a sample path generated from a Gaussian process with an isotropic Matérn covariance kernel K_ν (Stein, 1999). We set ν = 2.01 and 3.01 to generate f_0 with smoothness 2 and 3, respectively; see Corollary 4.15 in Kanagawa et al. (2018) for a detailed discussion of the connection between ν and α. We generate the target/source functions and the offset function as follows: (i) the target function f_T is a sample path of a Gaussian process with Matérn kernel K_{1.01}, so that f_T ∈ H^1; (ii) the offset function f_δ is a sample path of a Gaussian process with Matérn kernel K_ν for ν = 2.01, 3.01, 4.01, so that f_δ belongs to H^2, H^3, H^4, respectively. (A sketch of this sample-path construction appears after the table.)
Dataset Splits | Yes | To develop an adaptive procedure without known α_0, we employ a standard training and validation approach (Steinwart & Christmann, 2008). To this end, we construct a finite set that is an arithmetic sequence... Split the dataset D = {(x_i, y_i)}_{i=1}^n into D_1 := {(x_1, y_1), ..., (x_j, y_j)} and D_2 := {(x_{j+1}, y_{j+1}), ..., (x_n, y_n)}. We set the candidate smoothness as [1, 2, 3, 4, 5] and split the dataset equally in size to implement training and validation. (A sketch of this selection step appears after the table.)
Hardware Specification | No | The paper describes its experimental setup in Section 5 but does not specify any hardware details such as CPU or GPU models or memory.
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers.
Experiment Setup | Yes | For different α, we set λ = exp{−C n^{2/(2α+1)}} with a fixed C. To evaluate the adaptivity rate, we set the candidate smoothness as [1, 2, 3, 4, 5]... We try different values of C lying in the equally spaced sequence [0.05, 0.1, ..., 4], and report the optimal curve in Figure 2 under the best choice of C. In our implementation, we consider the finite basis as (1) the Fourier basis B_j(x) = √2 cos(πjx) (which was used in Wang et al. (2016)) and (2) B_j being the j-th order B-spline. ... For each α, we determine the constant C in exp{−C n^{1/(2α+d)}} via the following cross-validation (CV) approach. (A sketch of the λ schedule and the grid search over C appears after the table.)
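
The pseudocode entry above names Algorithm 1 (OTL-KRR). The Python sketch below only illustrates the generic two-phase offset structure suggested by the source/offset decomposition described in the open-datasets entry (fit the source function, then fit the offset on target residuals); it is not the paper's Algorithm 1, whose exact steps, kernel, and tuning should be taken from the paper and repository. The Gaussian kernel and the helper names (`krr_fit`, `otl_krr`) are illustrative assumptions.

```python
import numpy as np

def gaussian_kernel(X, Z, gamma=1.0):
    """Gaussian (RBF) kernel matrix: K[i, j] = exp(-gamma * ||x_i - z_j||^2)."""
    d2 = np.sum(X**2, axis=1)[:, None] + np.sum(Z**2, axis=1)[None, :] - 2.0 * X @ Z.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def krr_fit(X, y, lam, gamma=1.0):
    """Kernel ridge regression; returns a predictor X_new -> f_hat(X_new)."""
    n = len(y)
    K = gaussian_kernel(X, X, gamma)
    # Scaling the penalty by n is one common convention; the paper may differ.
    alpha = np.linalg.solve(K + n * lam * np.eye(n), y)
    return lambda Xnew: gaussian_kernel(Xnew, X, gamma) @ alpha

def otl_krr(Xs, ys, Xt, yt, lam_source, lam_offset, gamma=1.0):
    """Two-phase offset transfer: source KRR, then KRR on target residuals."""
    f_source = krr_fit(Xs, ys, lam_source, gamma)          # phase 1: source-only estimate
    residuals = yt - f_source(Xt)                          # target pseudo-responses for the offset
    f_offset = krr_fit(Xt, residuals, lam_offset, gamma)   # phase 2: offset estimate
    return lambda Xnew: f_source(Xnew) + f_offset(Xnew)    # combined target predictor
```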
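The open-datasets entry describes generating the target and offset functions as Gaussian-process sample paths with Matérn covariance K_ν. Below is a minimal sketch of that construction using scikit-learn's Matern kernel on a fine grid; the grid resolution, length scale, jitter, and random seed are illustrative assumptions rather than values from the paper.

```python
import numpy as np
from sklearn.gaussian_process.kernels import Matern

rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 500)[:, None]      # fine evaluation grid on [0, 1]

def matern_sample_path(nu, length_scale=0.2, jitter=1e-10):
    """One sample path of a zero-mean GP with isotropic Matérn covariance K_nu."""
    K = Matern(length_scale=length_scale, nu=nu)(grid)   # covariance matrix on the grid
    return rng.multivariate_normal(np.zeros(len(grid)), K + jitter * np.eye(len(grid)))

f_target = matern_sample_path(nu=1.01)                               # target f_T, roughly H^1
f_offsets = {nu: matern_sample_path(nu) for nu in (2.01, 3.01, 4.01)}  # offsets f_delta in H^2, H^3, H^4
```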
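The dataset-splits entry describes choosing the smoothness level over the candidate set [1, 2, 3, 4, 5] with an equal-size training/validation split. A minimal sketch of that selection loop follows; `fit_for_smoothness` is a hypothetical callable standing in for the KRR fit with the λ(α) schedule quoted in the experiment-setup entry.

```python
import numpy as np

def select_smoothness(X, y, fit_for_smoothness, candidates=(1, 2, 3, 4, 5)):
    """Pick the candidate smoothness with the smallest validation error."""
    n = len(y)
    X1, y1 = X[: n // 2], y[: n // 2]          # D1: training half
    X2, y2 = X[n // 2 :], y[n // 2 :]          # D2: validation half
    best_alpha, best_err = None, np.inf
    for alpha in candidates:
        f_hat = fit_for_smoothness(X1, y1, alpha)   # estimator tuned for this alpha
        err = np.mean((y2 - f_hat(X2)) ** 2)        # validation MSE on D2
        if err < best_err:
            best_alpha, best_err = alpha, err
    return best_alpha
```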
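The experiment-setup entry specifies λ = exp{−C n^{2/(2α+1)}} with C swept over the equally spaced grid [0.05, 0.1, ..., 4] and chosen by validation/CV error, and mentions the Fourier basis √2 cos(πjx). The sketch below implements those pieces under the stated reading of the quoted formulas; the fitting routine is passed in (for example, the `krr_fit` sketch above), and the helper names are assumptions.

```python
import numpy as np

def lambda_schedule(n, alpha, C):
    """lambda = exp(-C * n^(2 / (2*alpha + 1))), as quoted in the setup entry."""
    return np.exp(-C * n ** (2.0 / (2.0 * alpha + 1.0)))

def fourier_basis(x, j):
    """Cosine basis function B_j(x) = sqrt(2) * cos(pi * j * x)."""
    return np.sqrt(2.0) * np.cos(np.pi * j * x)

def pick_C(X1, y1, X2, y2, alpha, fit, grid=np.arange(0.05, 4.0001, 0.05)):
    """Choose C over the grid [0.05, 0.1, ..., 4] by validation error."""
    best_C, best_err = None, np.inf
    for C in grid:
        lam = lambda_schedule(len(y1), alpha, C)
        f_hat = fit(X1, y1, lam)                 # e.g., krr_fit from the OTL-KRR sketch
        err = np.mean((y2 - f_hat(X2)) ** 2)     # validation MSE
        if err < best_err:
            best_C, best_err = C, err
    return best_C
```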