Generalist Equivariant Transformer Towards 3D Molecular Interaction Learning

Authors: Xiangzhe Kong, Wenbing Huang, Yang Liu

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we aim to answer the following three questions via empirical experiments: (1) Does modeling complexes with unified representation better capture the geometric interactions than treating each interacting entity independently with domain-specific representations (§4.1)? (2) Is the proposed unified representation more expressive than vanilla single-level representations or pooling-based hierarchical methods (§4.2)? (3) Can the proposed method generalize to different domains by learning the various underlying interaction physics (§4.3)?
Researcher Affiliation | Academia | Xiangzhe Kong (1), Wenbing Huang (2, 3), Yang Liu (1, 4). (1) Dept. of Comp. Sci. & Tech., Institute for AI, BNRist Center, Tsinghua University; (2) Gaoling School of Artificial Intelligence, Renmin University of China; (3) Beijing Key Laboratory of Big Data Management and Analysis Methods; (4) Institute for AI Industry Research (AIR), Tsinghua University.
Pseudocode | No | The paper provides detailed mathematical equations and a schematic diagram (Figure 3), but it does not include structured pseudocode or clearly labeled algorithm blocks.
Open Source Code | Yes | Code for our GET as well as the experiments is available at https://github.com/THUNLP-MT/GET.
Open Datasets | Yes | We follow Somnath et al. (2021); Wang et al. (2022a) to conduct experiments on the well-established PDBbind (Wang et al., 2004; Liu et al., 2015) and split the dataset (4,709 biomolecular complexes) according to sequence identity of the protein with 30% as the threshold.
Dataset Splits | Yes | A total of 4,709 complexes are first filtered by resolution and then split into 3,507, 466, and 490 for training, validation, and testing (Somnath et al., 2021). We split these complexes into training set and validation set with a ratio of 9:1 with respect to the number of clusters.
Hardware Specification | Yes | We conduct experiments on 1 GeForce RTX 2080 Ti GPU with 12G memory, except for the zero-shot evaluation on RNA/DNA-ligand affinity, which needs 2 GPUs.
Software Dependencies | No | The paper mentions using PyTorch Geometric, the Adam optimizer, and MMseqs2. However, it does not provide specific version numbers for these software components or for other key dependencies such as Python or PyTorch, which are necessary for full reproducibility.
Experiment Setup | Yes | We give the description of the hyperparameters in Table 9 and their values for each task in Table 10.
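
The Dataset Splits entry describes a 9:1 train/validation split "with respect to the number of clusters", meaning the ratio is applied to clusters rather than to individual complexes. The following is a minimal sketch of that idea, not the authors' implementation. The function name `split_by_cluster`, the `cluster_of` mapping, and the seed are hypothetical, and the sketch assumes cluster assignments have already been computed by some sequence-clustering tool.

```python
import random
from collections import defaultdict


def split_by_cluster(cluster_of, val_fraction=0.1, seed=0):
    """Split items into train/validation so that no cluster spans both sets.

    cluster_of: dict mapping item id -> cluster id. This is a hypothetical
    input format; in practice it would be parsed from a clustering tool's output.
    val_fraction: fraction of *clusters* (not items) assigned to validation.
    """
    # Group item ids by their cluster id.
    clusters = defaultdict(list)
    for item, cluster in cluster_of.items():
        clusters[cluster].append(item)

    # Shuffle cluster ids deterministically so the split is repeatable.
    cluster_ids = sorted(clusters)
    random.Random(seed).shuffle(cluster_ids)

    # Take roughly val_fraction of the clusters (at least one) for validation.
    n_val = max(1, round(len(cluster_ids) * val_fraction))
    val_clusters = set(cluster_ids[:n_val])

    train = [i for c in cluster_ids if c not in val_clusters for i in clusters[c]]
    val = [i for c in cluster_ids if c in val_clusters for i in clusters[c]]
    return train, val


if __name__ == "__main__":
    # Toy data: 10 clusters with 2 items each (illustrative only).
    toy = {f"c{k}_{j}": f"c{k}" for k in range(10) for j in range(2)}
    train_ids, val_ids = split_by_cluster(toy)
    print(len(train_ids), len(val_ids))
```

Because whole clusters are assigned to one side, the item-level ratio can deviate from 9:1 when cluster sizes are uneven; what the split guarantees is that similar sequences grouped into one cluster never appear in both sets.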