Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Physics Aware Neural Networks for Unsupervised Binding Energy Prediction
Authors: Ke Liu, Hao Chen, Chunhua Shen
ICML 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments are conducted on the unsupervised protein-ligand binding energy prediction benchmarks, comparing them with previous works. Empirical results and theoretic analysis demonstrate that CEBind is more efficient and outperforms previous unsupervised models on benchmarks. |
| Researcher Affiliation | Academia | Zhejiang University, Hangzhou, China. Correspondence to: Hao Chen <EMAIL>. |
| Pseudocode | Yes | Algorithm 1: Training procedure (single data point); Algorithm 2: Training procedure of CEBind (single data point); Algorithm 3: Training procedure of DSMBind (single data point). |
| Open Source Code | No | The paper does not contain any explicit statements about releasing code, nor does it provide a link to a code repository. |
| Open Datasets | Yes | The protein-small molecule dataset contains 4806 protein-ligand complexes from PDBbind V2020 database for training (Su et al., 2018), 357 complexes randomly sampled from PDBbind in (Stärk et al., 2022) for evaluation, and 258 complexes from the PDBbind core set with labels of binding energy (Su et al., 2018) for test. The antibody-antigen dataset includes 3416 antibody-antigen complexes from the structural antibody database (SAbDab) (Schneider et al., 2022) for training, 116 complexes from CSM-sb (Myung et al., 2022) for evaluation, and 566 complexes with labels of binding affinity from SAbDab for test. |
| Dataset Splits | Yes | The protein-small molecule dataset contains 4806 protein-ligand complexes from PDBbind V2020 database for training (Su et al., 2018), 357 complexes randomly sampled from PDBbind in (Stärk et al., 2022) for evaluation, and 258 complexes from the PDBbind core set with labels of binding energy (Su et al., 2018) for test. The antibody-antigen dataset includes 3416 antibody-antigen complexes from the structural antibody database (SAbDab) (Schneider et al., 2022) for training, 116 complexes from CSM-sb (Myung et al., 2022) for evaluation, and 566 complexes with labels of binding affinity from SAbDab for test. |
| Hardware Specification | Yes | All our experiments are conducted on a computing cluster with 8 GPUs of NVIDIA GeForce RTX 4090 24GB and CPUs of AMD EPYC 7763 64-Core of 3.52GHz. All the inferences are conducted on a single GPU of NVIDIA GeForce RTX 4090 24GB. |
| Software Dependencies | Yes | We use the pre-trained ESM of version esm2_t36_3B_UR50D for protein residue embedding. We use the SRU (Lei et al., 2017) as our protein-ligand interaction modeling model following DSMBind. |
| Experiment Setup | Yes | We train CEBind, Gauss DSMBind, and DSMBind for 10 epochs. We train all the models with the same hyperparameters following DSMBind (Jin et al., 2024). The batch size, learning rate, and hidden vector size are 4, 1e-3, and 256, respectively. We assign the duration of t as a random number from 0 to 1. |
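
The setup and split figures quoted in the table can be collected into a small configuration object, which may help anyone attempting a reproduction. The following is only a minimal sketch: the names `TrainConfig`, `SPLITS`, `sample_t`, and `steps_per_epoch` are hypothetical and do not come from the paper, and the step count assumes the last partial batch is kept, which the paper does not state.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters as quoted in the Experiment Setup row."""
    epochs: int = 10
    batch_size: int = 4
    learning_rate: float = 1e-3
    hidden_size: int = 256


# Split sizes as quoted in the Dataset Splits row.
# The key names ("train", "evaluation", "test") are illustrative labels.
SPLITS = {
    "protein_ligand": {"train": 4806, "evaluation": 357, "test": 258},
    "antibody_antigen": {"train": 3416, "evaluation": 116, "test": 566},
}


def sample_t(rng: random.Random) -> float:
    """Draw the duration t as a random number from 0 to 1."""
    return rng.uniform(0.0, 1.0)


def steps_per_epoch(n_train: int, batch_size: int) -> int:
    """Ceiling division; assumes the final partial batch is not dropped."""
    return -(-n_train // batch_size)


if __name__ == "__main__":
    cfg = TrainConfig()
    rng = random.Random(0)  # fixed seed chosen only for this illustration
    for name, sizes in SPLITS.items():
        steps = steps_per_epoch(sizes["train"], cfg.batch_size)
        print(name, "steps per epoch:", steps, "total steps:", steps * cfg.epochs)
    print("example t:", sample_t(rng))
```

Keeping the values in a frozen dataclass makes accidental changes during a reproduction run raise an error instead of silently altering the setup.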