Pin-Tuning: Parameter-Efficient In-Context Tuning for Few-Shot Molecular Property Prediction

Authors: Qiang Liu, Shaozhen Liu, Xin Sun, Shu Wu, Liang Wang

NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | When evaluated on public datasets, our method demonstrates superior tuning with fewer trainable parameters, improving few-shot predictive performance.
Researcher Affiliation | Academia | 1 New Laboratory of Pattern Recognition (NLPR), State Key Laboratory of Multimodal Artificial Intelligence Systems (MAIS), Institute of Automation, Chinese Academy of Sciences (CASIA); 2 School of Artificial Intelligence, University of Chinese Academy of Sciences; 3 University of Science and Technology of China. liang.wang@cripac.ia.ac.cn, qiang.liu@nlpr.ia.ac.cn, liushaozhen2025@ia.ac.cn, sunxin000@mail.ustc.edu.cn, {shu.wu, wangliang}@nlpr.ia.ac.cn
Pseudocode | Yes | Appendix B: Pseudo-code of training process
Open Source Code | Yes | Code is available at: https://github.com/CRIPAC-DIG/Pin-Tuning
Open Datasets | Yes | We use five common few-shot molecular property prediction datasets from MoleculeNet [61]: Tox21, SIDER, MUV, ToxCast, and PCBA. (An illustrative dataset-loading sketch follows the table.)
Dataset Splits | Yes | The training set, comprising multiple tasks $\{\mathcal{T}_{\text{train}}\}$, is represented as $\mathcal{D}_{\text{train}} = \{(m_i, y_{i,t}) \mid t \in \{\mathcal{T}_{\text{train}}\}\}$... Correspondingly, the test set $\mathcal{D}_{\text{test}}$ is formed by tasks $\{\mathcal{T}_{\text{test}}\}$... Episodic training... For each episode $E_t$, a particular task $\mathcal{T}_t$ is selected from the training set, along with the corresponding support set $S_t$ and query set $Q_t$... In the outer loop, the classification loss on the query set is denoted $\mathcal{L}^{\text{cls}}_{t,Q}$. Together with our Emb-BWC regularizer, the meta-training loss $\mathcal{L}(f_\theta)$ is computed and we perform an outer-loop optimization with learning rate $\alpha_{\text{outer}}$ across the mini-batch: $\mathcal{L}(f_\theta) = \frac{1}{B}\sum_{t=1}^{B} \mathcal{L}^{\text{cls}}_{t,Q}(f_\theta) + \lambda\,\mathcal{L}_{\text{Emb-BWC}}$. (A minimal training-loop sketch follows the table.)
Hardware Specification | Yes | Our experiments are conducted on Linux servers equipped with an AMD EPYC 7742 CPU (256) @ 2.250 GHz, 256 GB RAM and NVIDIA 3090 GPUs.
Software Dependencies | Yes | Our model is implemented in PyTorch version 1.12.1, PyTorch Geometric version 2.3.1 (https://pyg.org/) with CUDA version 11.3, RDKit version 2023.3.3 and Python 3.9.18. (A version-check snippet follows the table.)
Experiment Setup | Yes | Following previous works, we set d = 300. For the MLPs in Eq. (2), we use the ReLU activation with d1 = 600. The pre-trained GIN model provided by Pre-GNN [20] is adopted as the PTME in our framework. We tune the weight of the update constraint (i.e., λ) in {0.01, 0.1, 1, 10}, the inner-loop learning rate (i.e., α_inner) in {1e-3, 5e-3, 1e-2, 5e-2, 1e-1, 5e-1, 1, 5}, and the outer-loop learning rate (i.e., α_outer) in {1e-5, 1e-4, 1e-3, 1e-2, 1e-1}. Based on the results of hyperparameter tuning, we adopt α_inner = 0.5, α_outer = 1e-3 and d2 = 50. The Context Encoder(·) described in Section 4.2 is implemented as a 2-layer message passing neural network [11]. (These settings are collected into a configuration sketch after the table.)
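
The five benchmarks quoted in the Open Datasets row are all available through PyTorch Geometric's built-in MoleculeNet wrapper, which is part of the reported software stack. The sketch below only shows how to download the raw datasets; it is not the authors' preprocessing or few-shot task construction, and the `root` paths are arbitrary choices.

```python
# Pull the five MoleculeNet benchmarks named in the paper via PyG's wrapper.
# Requires RDKit for SMILES processing (already in the reported environment).
from torch_geometric.datasets import MoleculeNet

names = ["Tox21", "SIDER", "MUV", "ToxCast", "PCBA"]  # accepted MoleculeNet names
datasets = {name: MoleculeNet(root=f"data/{name}", name=name) for name in names}

for name, ds in datasets.items():
    # Each element is a molecular graph; y holds the (multi-task) binary labels.
    print(f"{name}: {len(ds)} molecules, {ds[0].y.size(-1)} tasks")
```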
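
The Dataset Splits row describes episodic meta-training: each episode draws a task with a support set and a query set, the inner loop adapts with learning rate α_inner, and the outer loop minimizes the mean query-set classification loss plus λ·L_Emb-BWC over a mini-batch of B episodes. Below is a minimal PyTorch sketch of that loop on synthetic tensors, assuming a MAML-style inner step on the task head only; `encoder`, `head`, `sample_episode`, and `emb_bwc_penalty` are hypothetical stand-ins, and the penalty is merely a simple weight-anchoring surrogate, not the paper's actual Emb-BWC term.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

d = 300                                   # molecular embedding size used in the paper
alpha_inner, alpha_outer = 0.5, 1e-3      # adopted inner/outer learning rates
lam, B = 0.1, 4                           # regularizer weight and episodes per mini-batch

encoder = nn.Sequential(nn.Linear(64, d), nn.ReLU())   # stand-in for the pre-trained molecular encoder (PTME)
head = nn.Linear(d, 1)                                  # per-task binary classification head

outer_opt = torch.optim.Adam(list(encoder.parameters()) + list(head.parameters()), lr=alpha_outer)
init_emb_weight = encoder[0].weight.detach().clone()    # pre-trained weights the penalty anchors to


def sample_episode(n_support=20, n_query=16, in_dim=64):
    """Synthetic stand-in for drawing one task with a support set and a query set."""
    xs, ys = torch.randn(n_support, in_dim), torch.randint(0, 2, (n_support, 1)).float()
    xq, yq = torch.randn(n_query, in_dim), torch.randint(0, 2, (n_query, 1)).float()
    return (xs, ys), (xq, yq)


def emb_bwc_penalty():
    """Quadratic update constraint keeping the embedder near its pre-trained weights
    (a surrogate for the paper's Emb-BWC regularizer)."""
    return ((encoder[0].weight - init_emb_weight) ** 2).sum()


for step in range(10):                                   # outer-loop iterations
    outer_opt.zero_grad()
    query_loss = 0.0
    for _ in range(B):                                   # mini-batch of B episodes
        (xs, ys), (xq, yq) = sample_episode()
        # Inner loop: one gradient step of the task head on the support set (lr = alpha_inner).
        w, b = head.weight, head.bias
        support_loss = F.binary_cross_entropy_with_logits(encoder(xs) @ w.t() + b, ys)
        gw, gb = torch.autograd.grad(support_loss, (w, b), create_graph=True)
        w_adapt, b_adapt = w - alpha_inner * gw, b - alpha_inner * gb
        # Outer loop: classification loss of the adapted head on the query set.
        logits_q = encoder(xq) @ w_adapt.t() + b_adapt
        query_loss = query_loss + F.binary_cross_entropy_with_logits(logits_q, yq)
    # L(f_theta) = (1/B) * sum_t L^cls_{t,Q} + lambda * L_Emb-BWC
    loss = query_loss / B + lam * emb_bwc_penalty()
    loss.backward()
    outer_opt.step()
    print(f"step {step}: meta-loss = {loss.item():.4f}")
```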
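
A small sanity check against the versions listed in the Software Dependencies row; this is only a convenience snippet, not part of the released code.

```python
import sys
import torch
import torch_geometric
import rdkit

print("Python        :", sys.version.split()[0])        # reported: 3.9.18
print("PyTorch       :", torch.__version__)             # reported: 1.12.1
print("PyG           :", torch_geometric.__version__)   # reported: 2.3.1
print("RDKit         :", rdkit.__version__)             # reported: 2023.3.3
print("CUDA (PyTorch):", torch.version.cuda)            # reported: 11.3
```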
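
Finally, the hyperparameters quoted in the Experiment Setup row can be gathered into one configuration sketch. The dictionary keys and the small ReLU MLP are illustrative assumptions (in particular the MLP's input/output width d), not the authors' config schema or the exact architecture of Eq. (2).

```python
import torch.nn as nn

# Hyperparameters quoted in the Experiment Setup row; key names are illustrative.
config = {
    "d": 300,                                     # molecular embedding size
    "d1": 600,                                    # hidden width of the MLPs in Eq. (2)
    "d2": 50,                                     # adopted after tuning
    "lambda_grid": [0.01, 0.1, 1, 10],            # search space for the Emb-BWC weight
    "alpha_inner_grid": [1e-3, 5e-3, 1e-2, 5e-2, 1e-1, 5e-1, 1, 5],
    "alpha_outer_grid": [1e-5, 1e-4, 1e-3, 1e-2, 1e-1],
    "alpha_inner": 0.5,                           # adopted inner-loop learning rate
    "alpha_outer": 1e-3,                          # adopted outer-loop learning rate
    "context_encoder_layers": 2,                  # 2-layer message passing network [11]
}

# Illustrative ReLU MLP with the stated widths; the output width d is an assumption.
mlp = nn.Sequential(
    nn.Linear(config["d"], config["d1"]),
    nn.ReLU(),
    nn.Linear(config["d1"], config["d"]),
)
print(mlp)
```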