Transfer Learning with Affine Model Transformation

Authors: Shunya Minami, Kenji Fukumizu, Yoshihiro Hayashi, Ryo Yoshida

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Through several case studies, we demonstrate the practical benefits of modeling and estimating inter-domain commonality and domain-specific factors separately with the affine-type transfer models."
Researcher Affiliation | Academia | Shunya Minami, The Institute of Statistical Mathematics (mshunya@ism.ac.jp); Kenji Fukumizu, The Institute of Statistical Mathematics (fukumizu@ism.ac.jp); Yoshihiro Hayashi, The Institute of Statistical Mathematics (yhayashi@ism.ac.jp); Ryo Yoshida, The Institute of Statistical Mathematics (yoshidar@ism.ac.jp)
Pseudocode | Yes | Algorithm 1: Block relaxation algorithm [19]
  Initialize: a_0; b_0 = 0, c_0 = 0
  repeat
    a_{t+1} = argmin_a F(a, b_t, c_t)
    b_{t+1} = argmin_b F(a_{t+1}, b, c_t)
    c_{t+1} = argmin_c F(a_{t+1}, b_{t+1}, c)
  until convergence
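The block relaxation loop above can be sketched in a few lines of Python. This is a minimal illustration of the cyclic block-wise minimization pattern only: the toy quadratic objective and its closed-form block minimizers below are hypothetical stand-ins, not the paper's actual transfer-learning objective.

```python
# Minimal sketch of Algorithm 1 (block relaxation): cyclically minimize
# F(a, b, c) over one block at a time, holding the other two fixed.
# The objective and block minimizers here are illustrative stand-ins.

def block_relaxation(F, argmin_a, argmin_b, argmin_c,
                     a, b, c, tol=1e-10, max_iter=1000):
    """Repeat block-wise minimization until F stops decreasing."""
    prev = F(a, b, c)
    for _ in range(max_iter):
        a = argmin_a(b, c)          # a_{t+1} = argmin_a F(a, b_t, c_t)
        b = argmin_b(a, c)          # b_{t+1} = argmin_b F(a_{t+1}, b, c_t)
        c = argmin_c(a, b)          # c_{t+1} = argmin_c F(a_{t+1}, b_{t+1}, c)
        cur = F(a, b, c)
        if prev - cur < tol:        # stop when the objective no longer drops
            break
        prev = cur
    return a, b, c

# Toy objective F(a, b, c) = (a - b)^2 + (b - c)^2 + (c - 1)^2,
# minimized at a = b = c = 1; each block update is closed-form.
F = lambda a, b, c: (a - b) ** 2 + (b - c) ** 2 + (c - 1) ** 2
a, b, c = block_relaxation(
    F,
    argmin_a=lambda b, c: b,             # dF/da = 0  ->  a = b
    argmin_b=lambda a, c: (a + c) / 2,   # dF/db = 0  ->  b = (a + c) / 2
    argmin_c=lambda a, b: (b + 1) / 2,   # dF/dc = 0  ->  c = (b + 1) / 2
    a=0.0, b=0.0, c=0.0,
)
```

Because each step solves one block exactly with the others frozen, F is non-increasing across iterations, which is what the convergence check exploits.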
Open Source Code | Yes | The Python code is available at https://github.com/mshunya/AffineTL.
Open Datasets | Yes | We used the dataset from [8], which records SPS and LTC for 320 and 45 inorganic compounds, respectively. The input compounds were translated to 290-dimensional compositional descriptors using XenonPy [6, 33, 34, 35] (https://github.com/yoshida-lab/XenonPy). Experimental values of the specific heat capacity of the 70 polymers were collected from PolyInfo [39].
Dataset Splits | Yes | The regularization parameter in the kernel ridge regression and λ_α, λ_β, and λ_γ in the affine model transfer were selected through 5-fold cross-validation.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory amounts) used for running its experiments, only general mentions of machine learning tasks.
Software Dependencies | No | The paper mentions several software tools and libraries (e.g., Adagrad, Adam, scikit-learn) but does not provide specific version numbers for them.
Experiment Setup | Yes | The scale parameter ℓ was set to the square root of the input dimension: ℓ = 21 for Direct, HTL-offset, and HTL-scale; ℓ = 6 for Only source; and ℓ = 27 for Augmented. The regularization parameter λ was selected by 5-fold cross-validation with a grid search over 50 grid points in the interval [10^−4, 10^2]. The hyperparameters to be optimized are the three regularization parameters λ_1, λ_2, and λ_3. We performed 5-fold cross-validation to identify the best hyperparameter set from the candidate points: {10^−3, 10^−2, 10^−1, 1} for λ_1 and {10^−2, 10^−1, 1, 10} for each of λ_2 and λ_3.
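The λ selection described above (5-fold CV with a 50-point grid over [10^−4, 10^2]) can be sketched with plain NumPy kernel ridge regression. The synthetic data, RBF kernel, and seed below are illustrative assumptions; the paper's ℓ values (21, 6, 27) come from its own descriptor dimensions, while here ℓ is likewise set to the square root of the toy input dimension.

```python
import numpy as np

# Sketch of hyperparameter selection: 5-fold cross-validation for kernel
# ridge regression, with lambda chosen by grid search over 50 log-spaced
# points in [1e-4, 1e2]. Data and kernel here are synthetic stand-ins.

def rbf_kernel(X, Z, length_scale):
    """RBF (Gaussian) kernel matrix between rows of X and Z."""
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * length_scale ** 2))

def cv_score(X, y, lam, length_scale, n_folds=5):
    """Mean squared error of kernel ridge regression over 5 CV folds."""
    idx = np.random.default_rng(0).permutation(len(X))
    folds = np.array_split(idx, n_folds)
    errs = []
    for k in range(n_folds):
        te = folds[k]
        tr = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        K = rbf_kernel(X[tr], X[tr], length_scale)
        # Ridge solution: (K + lam * I) alpha = y
        alpha = np.linalg.solve(K + lam * np.eye(len(tr)), y[tr])
        pred = rbf_kernel(X[te], X[tr], length_scale) @ alpha
        errs.append(np.mean((pred - y[te]) ** 2))
    return float(np.mean(errs))

rng = np.random.default_rng(1)
X = rng.normal(size=(60, 4))                  # toy inputs
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=60)
grid = np.logspace(-4, 2, 50)                 # 50 grid points in [1e-4, 1e2]
ell = np.sqrt(X.shape[1])                     # scale = sqrt(input dimension)
best_lam = min(grid, key=lambda lam: cv_score(X, y, lam, ell))
```

The same pattern extends to the three-parameter search over (λ_1, λ_2, λ_3) by iterating over the Cartesian product of the candidate sets instead of a single grid.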