Pin-Tuning: Parameter-Efficient In-Context Tuning for Few-Shot Molecular Property Prediction
Authors: Qiang Liu, Shaozhen Liu, Xin Sun, Shu Wu, Liang Wang
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | When evaluated on public datasets, our method demonstrates superior tuning with fewer trainable parameters, improving few-shot predictive performance. |
| Researcher Affiliation | Academia | 1 New Laboratory of Pattern Recognition (NLPR), State Key Laboratory of Multimodal Artificial Intelligence Systems (MAIS), Institute of Automation, Chinese Academy of Sciences (CASIA); 2 School of Artificial Intelligence, University of Chinese Academy of Sciences; 3 University of Science and Technology of China. Emails: liang.wang@cripac.ia.ac.cn, qiang.liu@nlpr.ia.ac.cn, liushaozhen2025@ia.ac.cn, sunxin000@mail.ustc.edu.cn, {shu.wu, wangliang}@nlpr.ia.ac.cn |
| Pseudocode | Yes | Appendix B: Pseudo-code of training process |
| Open Source Code | Yes | Code is available at: https://github.com/CRIPAC-DIG/Pin-Tuning |
| Open Datasets | Yes | We use five common few-shot molecular property prediction datasets from MoleculeNet [61]: Tox21, SIDER, MUV, ToxCast, and PCBA. |
| Dataset Splits | Yes | The training set, comprising multiple tasks $\{\mathcal{T}_{train}\}$, is represented as $\mathcal{D}_{train} = \{(m_i, y_{i,t}) \mid t \in \{\mathcal{T}_{train}\}\}$... Correspondingly, the test set $\mathcal{D}_{test}$, formed by tasks $\{\mathcal{T}_{test}\}$... Episodic training... For each episode $E_t$, a particular task $\mathcal{T}_t$ is selected from the training set, along with the corresponding support set $S_t$ and query set $Q_t$... In the outer loop, the classification loss of the query set is denoted as $\mathcal{L}^{cls}_{t,Q}$. Together with our Emb-BWC regularizer, the meta-training loss $\mathcal{L}(f_\theta)$ is computed and we do an outer-loop optimization with learning rate $\alpha_{outer}$ across the mini-batch: $\mathcal{L}(f_\theta) = \frac{1}{B} \sum_{t=1}^{B} \mathcal{L}^{cls}_{t,Q}(f_\theta) + \lambda \mathcal{L}_{\text{Emb-BWC}}$ (a minimal sketch of this episodic loop follows the table). |
| Hardware Specification | Yes | Our experiments are conducted on Linux servers equipped with an AMD EPYC 7742 CPU (256) @ 2.250 GHz, 256 GB RAM and NVIDIA 3090 GPUs. |
| Software Dependencies | Yes | Our model is implemented in PyTorch version 1.12.1, PyTorch Geometric version 2.3.1 (https://pyg.org/) with CUDA version 11.3, RDKit version 2023.3.3 and Python 3.9.18. |
| Experiment Setup | Yes | Following previous works, we set d = 300. For MLPs in Eq. (2), we use the ReLU activation with d1 = 600. The pre-trained GIN model provided by Pre-GNN [20] is adopted as the PTME in our framework. We tune the weight of the update constraint (i.e., λ) in {0.01, 0.1, 1, 10}, the learning rate of the inner loop (i.e., αinner) in {1e-3, 5e-3, 1e-2, 5e-2, 1e-1, 5e-1, 1, 5}, and the learning rate of the outer loop (i.e., αouter) in {1e-5, 1e-4, 1e-3, 1e-2, 1e-1}. Based on the results of hyperparameter tuning, we adopt αinner = 0.5, αouter = 1e-3, and d2 = 50. The Context Encoder(·) described in Section 4.2 is implemented using a 2-layer message passing neural network [11]. (An illustrative grid of this search space follows the table.) |
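
The episodic training quoted under Dataset Splits combines inner-loop adaptation on each support set with an outer-loop update on the mean query-set classification loss plus the Emb-BWC regularizer. Below is a minimal PyTorch-style sketch of that loop; the function signature and the helper callables (`sample_episode`, `adapt_on_support`, `query_loss`, `emb_bwc_penalty`) are illustrative assumptions, not the authors' released implementation.

```python
import torch

def meta_train_step(model, optimizer, train_tasks, *, B, lam, alpha_inner,
                    sample_episode, adapt_on_support, query_loss, emb_bwc_penalty):
    """One outer-loop step over a mini-batch of B episodes:
    L(f_theta) = (1/B) * sum_{t=1..B} L^cls_{t,Q}(f_theta) + lam * L_Emb-BWC."""
    query_losses = []
    for _ in range(B):
        # Sample a training task T_t with its support set S_t and query set Q_t.
        support_set, query_set = sample_episode(train_tasks)
        # Inner loop: adapt to the support set with learning rate alpha_inner.
        adapted = adapt_on_support(model, support_set, lr=alpha_inner)
        # Classification loss on the query set under the adapted parameters.
        query_losses.append(query_loss(model, adapted, query_set))

    # Meta-training loss: mean query loss plus the Emb-BWC update-constraint term.
    loss = torch.stack(query_losses).mean() + lam * emb_bwc_penalty(model)

    # Outer-loop optimization (optimizer configured with learning rate alpha_outer).
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.detach()
```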
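
For the Experiment Setup row, the reported search space and adopted values can be summarized as a small grid. The dictionary keys and the enumeration below are assumptions for illustration, not the released configuration files.

```python
from itertools import product

# Hyperparameter search space as reported in the paper.
search_space = {
    "lambda":      [0.01, 0.1, 1, 10],                          # Emb-BWC update-constraint weight
    "alpha_inner": [1e-3, 5e-3, 1e-2, 5e-2, 1e-1, 5e-1, 1, 5],  # inner-loop learning rate
    "alpha_outer": [1e-5, 1e-4, 1e-3, 1e-2, 1e-1],              # outer-loop learning rate
}

# Fixed and adopted settings reported in the paper (the chosen lambda is not stated).
adopted = {"d": 300, "d1": 600, "d2": 50, "alpha_inner": 0.5, "alpha_outer": 1e-3}

# Enumerate all 4 x 8 x 5 = 160 candidate configurations for tuning.
candidates = [dict(zip(search_space, values)) for values in product(*search_space.values())]
print(len(candidates))  # 160
```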