GIMLET: A Unified Graph-Text Model for Instruction-Based Molecule Zero-Shot Learning

Authors: Haiteng Zhao, Shengchao Liu, Chang Ma, Hannan Xu, Jie Fu, Zhihong Deng, Lingpeng Kong, Qi Liu

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results demonstrate that GIMLET significantly outperforms molecule-text baselines in instruction-based zero-shot learning, even achieving results close to supervised GNN models on tasks such as ToxCast and MUV. In the experiments, we investigate the following inquiries: (i) Can GIMLET effectively handle zero-shot molecule property tasks via instructions? (ii) Can GIMLET perform better with few-shot learning? (iii) What impact does model architecture have on the performance of GIMLET? (iv) How does pretraining affect the performance of GIMLET? (v) How does the form of instruction influence GIMLET for molecule zero-shot learning?
Researcher Affiliation | Academia | Haiteng Zhao (1), Shengchao Liu (2), Chang Ma (3), Hannan Xu (4), Jie Fu (5), Zhi-Hong Deng (1), Lingpeng Kong (3), Qi Liu (3); 1 Peking University, 2 Mila, 3 The University of Hong Kong, 4 University of Oxford, 5 Hong Kong University of Science and Technology
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. It describes the model architecture and mathematical formulations, but gives no step-by-step procedure in pseudocode form.
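
Since the paper gives no pseudocode, the following is a minimal, runnable sketch of the instruction-based zero-shot protocol it describes: each task is a natural-language instruction, and the model scores a molecule against it. The scorer here is a stub standing in for GIMLET's joint graph-text forward pass; `Example`, `zero_shot_eval`, and the yes/no thresholding are our assumptions, not the authors' API.

```python
# Minimal sketch (not the authors' implementation) of instruction-based
# zero-shot evaluation: a scorer takes (molecule, instruction) and returns
# a probability that the answer is "yes".
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Example:
    smiles: str   # molecule as a SMILES string
    label: int    # ground-truth binary property label

def zero_shot_eval(score: Callable[[str, str], float],
                   instruction: str,
                   examples: List[Example]) -> float:
    """Accuracy of a graph-text scorer on one instruction-defined task."""
    correct = 0
    for ex in examples:
        # GIMLET would jointly encode the molecular graph and the
        # instruction text, then decode a yes/no answer; `score` stands
        # in for that entire forward pass.
        pred = 1 if score(ex.smiles, instruction) > 0.5 else 0
        correct += int(pred == ex.label)
    return correct / len(examples)

# Stub scorer so the sketch runs end to end.
dummy = lambda smiles, instruction: 0.5 + 0.1 * (len(smiles) % 2)
print(zero_shot_eval(dummy,
                     "Is this compound toxic in the ToxCast assay?",
                     [Example("CCO", 0), Example("c1ccccc1", 1)]))
```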
Open Source Code | Yes | The code, model, and data are available at https://github.com/zhao-ht/GIMLET.
Open Datasets | Yes | To this end, we select ChEMBL [20] as the pretraining dataset, which is widely used for supervised graph pretraining [26, 62]... First, we include large-scale datasets PCBA [72]... We also target tasks from MoleculeNet [75], a popular benchmark for molecular property prediction... We construct a dataset consisting of more than two thousand molecule tasks with corresponding instructions derived from task descriptions. We pretrain GIMLET on the molecule tasks along with instructions, enabling the model to transfer effectively to a broad range of tasks.
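
For reference, the MoleculeNet benchmarks cited above can be pulled through third-party tooling such as DeepChem. This is a generic illustration only; the repository linked above ships its own instruction-augmented loaders.

```python
# Loading one MoleculeNet task (BBBP) with a scaffold split via DeepChem.
# GIMLET uses graph inputs plus text instructions; this only illustrates
# the underlying benchmark data, not the paper's pipeline.
import deepchem as dc

tasks, (train, valid, test), transformers = dc.molnet.load_bbbp(
    featurizer="ECFP",       # fingerprint features for this illustration
    splitter="scaffold",     # same split family the paper adopts
)
print(tasks, len(train), len(valid), len(test))
```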
Dataset Splits | Yes | Following the standard supervised setting in previous studies [26], we adopt the Scaffold split [51] with a ratio of 0.8/0.1/0.1 for all the datasets, and report results on the testing sets, ensuring the comparability of our results to previous works. We first split datasets into training, validation, and testing sets in the same way as in the zero-shot setting. Then K samples for each class are randomly sampled from the training set as the few-shot examples, where K is the few-shot number.
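
The few-shot sampling described above reduces to a simple per-class draw from the training split. A minimal sketch of that step, under our assumptions about the data representation (the authors' code may differ):

```python
# Assumed sketch of the K-shot protocol: after the scaffold split, draw
# K examples per class from the training set only.
import random
from collections import defaultdict
from typing import List, Tuple

def sample_k_shot(train: List[Tuple[str, int]], k: int,
                  seed: int = 0) -> List[Tuple[str, int]]:
    """Return K (smiles, label) pairs per class, sampled from `train`."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for smiles, label in train:
        by_class[label].append((smiles, label))
    shots = []
    for pool in by_class.values():
        shots.extend(rng.sample(pool, k))
    return shots

train = [("CCO", 0), ("CCN", 0), ("c1ccccc1", 1), ("C1CC1", 1)]
print(sample_k_shot(train, k=1))
```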
Hardware Specification | No | The paper does not specify the hardware used for its experiments (e.g., exact GPU/CPU models, processor types, or memory amounts).
Software Dependencies | No | The paper mentions using T5 [50] as the backbone language model, but it does not provide specific version numbers for T5 or any other key software components, libraries, or solvers required to replicate the experiments.
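
The T5 backbone itself is available through Hugging Face `transformers`. Since the paper states neither a checkpoint size nor library versions, the `"t5-small"` choice below is an assumption made purely for illustration.

```python
# Loading a T5 backbone with Hugging Face transformers. The paper names T5
# but no checkpoint or version, so "t5-small" here is an assumption.
from transformers import AutoTokenizer, T5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

inputs = tokenizer("Is this compound active in the MUV assay?",
                   return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```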
Experiment Setup | No | The paper states that 'The details of pretraining and downstream zero-shot testing are in Appendix.' and that 'In both classification tasks and regression tasks, we fine-tune the last linear layer of all models using their respective modeling loss.' However, it does not explicitly provide concrete hyperparameter values (e.g., learning rate, batch size, number of epochs, optimizer settings) or detailed system-level training configurations in the main text or appendix.
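
The "fine-tune the last linear layer" step quoted above corresponds to freezing the backbone and training only a final head. A plain-PyTorch sketch of that pattern follows; the encoder, head, and learning rate are assumptions, since the paper defers its exact settings to the appendix.

```python
# Sketch of last-layer fine-tuning in PyTorch: freeze every backbone
# parameter and optimize only a final linear head. Illustrative only.
import torch
import torch.nn as nn

backbone = nn.Sequential(nn.Linear(128, 128), nn.ReLU())  # stand-in encoder
head = nn.Linear(128, 1)                                  # task head to tune

for p in backbone.parameters():
    p.requires_grad = False                                # freeze the backbone

optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)   # lr is an assumption
loss_fn = nn.BCEWithLogitsLoss()                           # binary-classification loss

x = torch.randn(8, 128)                                    # dummy batch of features
y = torch.randint(0, 2, (8, 1)).float()                    # dummy binary labels
loss = loss_fn(head(backbone(x)), y)
loss.backward()                                            # gradients flow to head only
optimizer.step()
```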