PiFold: Toward effective and efficient protein inverse folding

Authors: Zhangyang Gao, Cheng Tan, Stan Z. Li

ICLR 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments show that PiFold could achieve 51.66% recovery on CATH 4.2, while the inference speed is 70 times faster than the autoregressive competitors. In addition, PiFold achieves 58.72% and 60.42% recovery scores on TS50 and TS500, respectively. We conduct comprehensive ablation studies to reveal the role of different types of protein features and model designs, inspiring further simplification and improvement.
Researcher Affiliation | Academia | Zhangyang Gao, Cheng Tan, Stan Z. Li; AI Lab, Research Center for Industries of the Future, Westlake University; {gaozhangyang, tancheng, Stan.ZQ.Li}@westlake.edu.cn
Pseudocode | No | The paper describes the methodology in text and figures, but does not include structured pseudocode or an algorithm block.
Open Source Code | Yes | The PyTorch code is available at GitHub.
Open Datasets | Yes | To answer these questions, we compare PiFold against recent strong baselines on the CATH (Orengo et al., 1997) dataset.
Dataset Splits | Yes | We use the same data splitting as GraphTrans (Ingraham et al., 2019) and GVP (Jing et al., 2020), where proteins are partitioned by the CATH topology classification, resulting in 18024 proteins for training, 608 proteins for validation, and 1120 proteins for testing.
Hardware Specification | Yes | The model is trained up to 100 epochs by default using the Adam optimizer on NVIDIA V100s.
Software Dependencies | No | The abstract mentions 'PyTorch code is available at GitHub,' but it does not specify the version number for PyTorch or any other software dependencies required to replicate the experiments.
Experiment Setup | Yes | We stack ten layers of PiGNN to construct the PiFold model with hidden dimension 128. The model is trained up to 100 epochs by default using the Adam optimizer on NVIDIA V100s. The batch size and learning rate used for training are 8 and 0.001, respectively.
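The recovery scores quoted in the Research Type row (51.66% on CATH 4.2, 58.72% on TS50, 60.42% on TS500) refer to native sequence recovery. The evaluation code is not reproduced on this page; the sketch below shows the conventional reading of the metric, the fraction of designed residues that match the native residue at the same position, and is an assumption rather than code taken from the PiFold repository.

```python
def sequence_recovery(predicted: str, native: str) -> float:
    """Fraction of positions where the designed residue matches the native one.

    Assumes both inputs are aligned one-letter amino-acid strings of equal
    length; this is the conventional definition of the recovery scores
    reported in the paper, not code from the PiFold release.
    """
    if len(predicted) != len(native):
        raise ValueError("sequences must be the same length")
    matches = sum(p == n for p, n in zip(predicted, native))
    return matches / len(native)

# Example: 6 of 8 residues recovered -> 0.75
print(sequence_recovery("MKTAYIAK", "MKTAYLAQ"))
```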
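The Dataset Splits row reports 18024/608/1120 chains for training, validation, and testing under the CATH topology split used by GraphTrans and GVP. As an illustration of how such a split is typically consumed, the snippet below loads a JSON split file; the file name `chain_set_splits.json` and the key names are assumptions based on common usage of this benchmark, not details confirmed against the PiFold release.

```python
import json

# Hypothetical split file: GraphTrans/GVP-style CATH 4.2 splits are commonly
# distributed as a JSON object mapping split names to lists of chain IDs.
# The file name and keys below are assumptions, not taken from the PiFold code.
SPLIT_FILE = "chain_set_splits.json"

with open(SPLIT_FILE) as f:
    splits = json.load(f)

train_ids = splits["train"]        # expected: 18024 chains
valid_ids = splits["validation"]   # expected: 608 chains
test_ids = splits["test"]          # expected: 1120 chains

print(len(train_ids), len(valid_ids), len(test_ids))
```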
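The Hardware Specification and Experiment Setup rows together fix the training configuration: ten PiGNN layers, hidden dimension 128, the Adam optimizer, batch size 8, learning rate 0.001, up to 100 epochs on NVIDIA V100s. The sketch below wires those hyperparameters into a minimal PyTorch setup; the model body is a placeholder per-residue MLP (`PiFoldSketch`) standing in for the actual PiGNN layers, and the CATH data pipeline is omitted.

```python
import torch
from torch import nn
from torch.optim import Adam

# Hyperparameters reported in the Experiment Setup row.
NUM_LAYERS = 10
HIDDEN_DIM = 128
BATCH_SIZE = 8
LEARNING_RATE = 1e-3
EPOCHS = 100
NUM_AMINO_ACIDS = 20

class PiFoldSketch(nn.Module):
    """Placeholder: a stack of per-residue MLP blocks standing in for the
    actual PiGNN layers defined in the released PyTorch code. Only the
    hyperparameter wiring (depth, width, one-shot readout) is shown."""

    def __init__(self, node_dim: int, hidden_dim: int = HIDDEN_DIM, num_layers: int = NUM_LAYERS):
        super().__init__()
        self.embed = nn.Linear(node_dim, hidden_dim)
        self.layers = nn.ModuleList(
            [nn.Sequential(nn.Linear(hidden_dim, hidden_dim), nn.ReLU()) for _ in range(num_layers)]
        )
        self.readout = nn.Linear(hidden_dim, NUM_AMINO_ACIDS)  # per-residue logits

    def forward(self, node_feats: torch.Tensor) -> torch.Tensor:
        h = self.embed(node_feats)       # (num_residues, hidden_dim)
        for layer in self.layers:
            h = layer(h)
        return self.readout(h)           # (num_residues, 20)

device = "cuda" if torch.cuda.is_available() else "cpu"  # paper trained on NVIDIA V100s
model = PiFoldSketch(node_dim=64).to(device)             # node_dim is an arbitrary placeholder
optimizer = Adam(model.parameters(), lr=LEARNING_RATE)
criterion = nn.CrossEntropyLoss()  # assumed per-residue cross-entropy over the 20 amino acids

# Smoke test: a fake batch of 5 residues with 64-dim features.
dummy = torch.randn(5, 64, device=device)
print(model(dummy).shape)  # torch.Size([5, 20])
```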