Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
MOL-Mamba: Enhancing Molecular Representation with Structural & Electronic Insights
Authors: Jingjing Hu, Dan Guo, Zhan Si, Deguang Liu, Yunfeng Diao, Jing Zhang, Jinxing Zhou, Meng Wang
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrate that MOL-Mamba outperforms state-of-the-art baselines across eleven chemical-biological molecular datasets. In this section, we conduct comprehensive experiments to demonstrate the efficacy of our proposed method. |
| Researcher Affiliation | Academia | (1) School of Computer Science and Information Engineering, Hefei University of Technology; (2) Institute of Artificial Intelligence, Hefei Comprehensive National Science Center; (3) Department of Chemistry and Centre for Atomic Engineering of Advanced Materials, Anhui University; (4) Department of Applied Chemistry, University of Science and Technology of China |
| Pseudocode | Yes | Algorithm 1: Mamba block with Graph SSM |
| Open Source Code | Yes | Code https://github.com/xian-sh/MOL-Mamba |
| Open Datasets | Yes | We use the recently popular GEOM dataset (Axelrod and Gomez-Bombarelli 2022), which contains 50k qualified molecules, for molecular pretraining, following (Liu et al. 2022; Wang et al. 2023). For downstream tasks, we conduct experiments on 11 benchmark datasets from MoleculeNet (Wu et al. 2018), spanning physical chemistry, biophysics, physiology, and quantum mechanics. |
| Dataset Splits | Yes | Each dataset uses the recommended splitting method to divide data into training/validation/test sets with a ratio of 8:1:1. |
| Hardware Specification | Yes | We develop all codes on a single NVIDIA RTX A5000 GPU. |
| Software Dependencies | No | The paper mentions the 'ChemDes package (Dong et al. 2015)', a '6-layer GIN (Xu et al. 2019)', and a '6-layer SchNet (Schütt et al. 2017)', but does not provide version numbers for these or for other software dependencies such as Python, PyTorch, or CUDA. |
| Experiment Setup | Yes | For pretraining, we set the temperature coefficient in Eq. 6 to τ = 0.5 and the mask ratio to α = 10% for the mask matrix M in Eq. 10. Based on the order of magnitude of each loss, we set the loss weights as follows: L_d = L_s = L_mask = 0.1 and L_f = 20.0. For both pretraining and fine-tuning, we employ the AdamW optimizer with a learning rate of 0.0001 and a batch size of 64; training runs for 100 epochs with early stopping on the validation set. |
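The split ratio and hyperparameters reported above can be captured in a short sketch. This is a minimal illustration, not code from the MOL-Mamba repository: the function and configuration names (`split_811`, `CONFIG`, `total_loss`) are assumptions introduced here, and only the numeric values come from the paper's quoted setup.

```python
def split_811(n_samples):
    """Partition dataset indices into train/val/test with the reported 8:1:1 ratio."""
    n_train = int(n_samples * 0.8)
    n_val = int(n_samples * 0.1)
    idx = list(range(n_samples))
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

# Hyperparameters as reported for pretraining and fine-tuning.
CONFIG = {
    "optimizer": "AdamW",
    "lr": 1e-4,          # learning rate 0.0001
    "batch_size": 64,
    "epochs": 100,       # with early stopping on the validation set
    "tau": 0.5,          # temperature coefficient in Eq. 6
    "mask_ratio": 0.10,  # alpha = 10% for the mask matrix M in Eq. 10
}

# Loss weights from the paper: 0.1 on L_d, L_s, L_mask and 20.0 on L_f.
LOSS_WEIGHTS = {"L_d": 0.1, "L_s": 0.1, "L_mask": 0.1, "L_f": 20.0}

def total_loss(losses):
    """Weighted sum of the individual pretraining losses."""
    return sum(LOSS_WEIGHTS[name] * value for name, value in losses.items())
```

For example, `split_811(100)` yields partitions of sizes 80/10/10, matching the 8:1:1 protocol; the actual per-dataset "recommended splitting method" (e.g. scaffold vs. random) is not specified here and would replace the index slicing in practice.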