Generalist Equivariant Transformer Towards 3D Molecular Interaction Learning
Authors: Xiangzhe Kong, Wenbing Huang, Yang Liu
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we aim to answer the following three questions via empirical experiments: (1) Does modeling complexes with unified representation better capture the geometric interactions than treating each interacting entity independently with domain-specific representations (Section 4.1)? (2) Is the proposed unified representation more expressive than vanilla single-level representations or pooling-based hierarchical methods (Section 4.2)? (3) Can the proposed method generalize to different domains by learning the various underlying interaction physics (Section 4.3)? |
| Researcher Affiliation | Academia | Xiangzhe Kong (1), Wenbing Huang (2,3), Yang Liu (1,4). (1) Dept. of Comp. Sci. & Tech., Institute for AI, BNRist Center, Tsinghua University; (2) Gaoling School of Artificial Intelligence, Renmin University of China; (3) Beijing Key Laboratory of Big Data Management and Analysis Methods; (4) Institute for AI Industry Research (AIR), Tsinghua University. |
| Pseudocode | No | The paper provides detailed mathematical equations and a schematic diagram (Figure 3), but it does not include structured pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | Yes | Codes for our GET as well as the experiments are available at https://github.com/THUNLP-MT/GET. |
| Open Datasets | Yes | We follow Somnath et al. (2021); Wang et al. (2022a) to conduct experiments on the well-established PDBbind (Wang et al., 2004; Liu et al., 2015) and split the dataset (4,709 biomolecular complexes) according to sequence identity of the protein with 30% as the threshold. |
| Dataset Splits | Yes | A total of 4,709 complexes are first filtered by resolution and then split into 3,507, 466, and 490 for training, validation, and testing (Somnath et al., 2021). We split these complexes into training and validation sets with a ratio of 9:1 with respect to the number of clusters. |
| Hardware Specification | Yes | We conduct experiments on 1 GeForce RTX 2080 Ti GPU with 12 GB memory, except the zero-shot evaluation on RNA/DNA-ligand affinity which needs 2 GPUs. |
| Software Dependencies | No | The paper mentions using PyTorch Geometric, Adam optimizer, and MMseqs2. However, it does not provide specific version numbers for these software components or other key dependencies like Python or PyTorch, which are necessary for full reproducibility. |
| Experiment Setup | Yes | We give the description of the hyperparameters in Table 9 and their values for each task in Table 10. |
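The cluster-level 9:1 train/validation split quoted above can be sketched as follows. This is a minimal illustration, not the paper's code: it assumes a precomputed mapping from each complex ID to a cluster label (e.g., obtained by clustering sequences at 30% identity with MMseqs2), and the function name `split_by_cluster` is hypothetical.

```python
import random

def split_by_cluster(complex_ids, cluster_of, val_ratio=0.1, seed=42):
    """Split complexes into train/validation at the cluster level,
    so no cluster appears in both sets (hypothetical helper sketching
    the paper's 9:1 split over clusters).

    complex_ids: list of complex identifiers
    cluster_of:  dict mapping complex ID -> cluster label
    """
    # Shuffle the distinct cluster labels deterministically.
    clusters = sorted({cluster_of[c] for c in complex_ids})
    rng = random.Random(seed)
    rng.shuffle(clusters)

    # Reserve ~val_ratio of the clusters (not of the complexes) for validation.
    n_val = max(1, int(len(clusters) * val_ratio))
    val_clusters = set(clusters[:n_val])

    train = [c for c in complex_ids if cluster_of[c] not in val_clusters]
    val = [c for c in complex_ids if cluster_of[c] in val_clusters]
    return train, val
```

Splitting over clusters rather than individual complexes prevents near-duplicate proteins (same sequence cluster) from leaking between training and validation, which is the point of the sequence-identity threshold.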