Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Omni-Mol: Multitask Molecular Model for Any-to-any Modalities
Authors: Chengxin Hu, Hao Li, Yihe Yuan, Zezheng Song, Chenyang Zhao, Haixin Wang
NeurIPS 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Comprehensive experiments on our datasets show that Omni-Mol achieves significant improvements across 13 tasks simultaneously, setting new state-of-the-art results among both finetuned open-source LLMs and in-context learned closed-source LLMs. Additionally, we observe that Omni-Mol scales effectively with increases in data volume and model size, indicating the model's tremendous potential under larger computational budgets. Furthermore, by analyzing the representations of models trained on progressively more tasks, we discover that the representations become increasingly similar as the number of tasks grows. |
| Researcher Affiliation | Academia | 1 National University of Singapore 2 Independent Researcher 3 University of Maryland, College Park 4 University of California, Los Angeles {EMAIL, EMAIL} |
| Pseudocode | No | The paper describes the model architecture and training procedures using mathematical equations and textual descriptions, but it does not include explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | GitHub: Omni-Mol-Code Hugging Face: Omni-Mol-Data&Weight |
| Open Datasets | Yes | We then collect a dataset encompassing over 16 tasks with more than 1.4 million samples, making it the largest molecular instruction-tuning dataset to date. (...) Our model achieves unified instruction tuning across 16 tasks and attains state-of-the-art performance on 13 of them. Extensive experiments further demonstrate the scalability and versatility of Omni-Mol. |
| Dataset Splits | Yes | Following [21], for the Forward Reaction Prediction task, we extract data from USPTO, and split the dataset into 124,384 training instances and 1,000 test instances. Partially following [11], for the Catalyst Prediction and Solvent Prediction tasks, we similarly extract data from USPTO, splitting the training/test sets into 10,079/1,015 and 67,099/7,793, respectively. |
| Hardware Specification | Yes | Accelerators. Training Omni-Mol costs 576 NVIDIA A100 80G GPU hours. |
| Software Dependencies | Yes | Software and Driver Versions. The experiments are conducted with the following key software: Python 3.12.1, PyTorch 2.5.1, Transformers 4.45.2, CUDA 12.4 |
| Experiment Setup | Yes | For unified tuning, we train 15 epochs with GAL rank of 64. For separate tuning, model is trained for 10 epochs with the same GAL configuration. The learning rate is set to 8e-5 from the grid search for all experiments. For consistency, the random seed is set to 0. More details can be found in Appendix D. |
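
The setup and split figures quoted in the table can be collected in one place for anyone attempting a rerun. The following is a minimal sketch, not the authors' configuration format: the names `TuningSetup`, `UNIFIED`, `SEPARATE`, `SPLITS`, and `held_out_fraction` are hypothetical, and only the numeric values come from the rows above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TuningSetup:
    """Hyperparameters quoted in the Experiment Setup row (field names are assumptions)."""
    epochs: int
    gal_rank: int
    learning_rate: float
    seed: int


# Values as reported: 15 epochs for unified tuning, 10 for separate tuning,
# GAL rank 64, learning rate 8e-5, random seed 0.
UNIFIED = TuningSetup(epochs=15, gal_rank=64, learning_rate=8e-5, seed=0)
SEPARATE = TuningSetup(epochs=10, gal_rank=64, learning_rate=8e-5, seed=0)

# (train, test) instance counts quoted in the Dataset Splits row.
SPLITS = {
    "forward_reaction_prediction": (124_384, 1_000),
    "catalyst_prediction": (10_079, 1_015),
    "solvent_prediction": (67_099, 7_793),
}


def held_out_fraction(task: str) -> float:
    """Return the share of a task's instances reserved for testing."""
    train, test = SPLITS[task]
    return test / (train + test)


if __name__ == "__main__":
    for task in SPLITS:
        print(f"{task}: {held_out_fraction(task):.2%} held out for testing")
```

Only the three USPTO-derived tasks quoted in the table are listed; the remaining tasks and further training details are in the paper and its Appendix D.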