PiFold: Toward effective and efficient protein inverse folding

Authors: Zhangyang Gao, Cheng Tan, Stan Z. Li

ICLR 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments show that PiFold could achieve 51.66% recovery on CATH 4.2, while the inference speed is 70 times faster than the autoregressive competitors. In addition, PiFold achieves 58.72% and 60.42% recovery scores on TS50 and TS500, respectively. We conduct comprehensive ablation studies to reveal the role of different types of protein features and model designs, inspiring further simplification and improvement.
Researcher Affiliation | Academia | Zhangyang Gao, Cheng Tan, Stan Z. Li; AI Lab, Research Center for Industries of the Future, Westlake University; {gaozhangyang, tancheng, Stan.ZQ.Li}@westlake.edu.cn
Pseudocode | No | The paper describes the methodology in text and figures, but does not include structured pseudocode or an algorithm block.
Open Source Code | Yes | The PyTorch code is available at GitHub.
Open Datasets | Yes | To answer these questions, we compare PiFold against recent strong baselines on the CATH (Orengo et al., 1997) dataset.
Dataset Splits | Yes | We use the same data splitting as GraphTrans (Ingraham et al., 2019) and GVP (Jing et al., 2020), where proteins are partitioned by the CATH topology classification, resulting in 18024 proteins for training, 608 proteins for validation, and 1120 proteins for testing.
Hardware Specification | Yes | The model is trained up to 100 epochs by default using the Adam optimizer on NVIDIA V100s.
Software Dependencies | No | The abstract mentions 'PyTorch code is available at GitHub,' but it does not specify the version number for PyTorch or any other software dependencies required to replicate the experiments.
Experiment Setup | Yes | We stack ten layers of PiGNN to construct the PiFold model with hidden dimension 128. The model is trained up to 100 epochs by default using the Adam optimizer on NVIDIA V100s. The batch size and learning rate used for training are 8 and 0.001, respectively.
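The recovery scores quoted in the Research Type row (51.66% on CATH 4.2, 58.72% on TS50, 60.42% on TS500) refer to native sequence recovery. The evaluation code is not reproduced on this page; the sketch below shows the conventional reading of the metric, the fraction of designed residues that match the native residue at the same position, and is an assumption rather than code taken from the PiFold repository.

```python
def sequence_recovery(predicted: str, native: str) -> float:
    """Fraction of positions where the designed residue matches the native one.

    Assumes both inputs are aligned one-letter amino-acid strings of equal
    length; this is the conventional definition of the recovery scores
    reported in the paper, not code from the PiFold release.
    """
    if len(predicted) != len(native):
        raise ValueError("sequences must be the same length")
    matches = sum(p == n for p, n in zip(predicted, native))
    return matches / len(native)

# Example: 6 of 8 residues recovered -> 0.75
print(sequence_recovery("MKTAYIAK", "MKTAYLAQ"))
```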
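The Dataset Splits row reports 18024/608/1120 chains for training, validation, and testing under the CATH topology split used by GraphTrans and GVP. As an illustration of how such a split is typically consumed, the snippet below loads a JSON split file; the file name `chain_set_splits.json` and the key names are assumptions based on common usage of this benchmark, not details confirmed against the PiFold release.

```python
import json

# Hypothetical split file: GraphTrans/GVP-style CATH 4.2 splits are commonly
# distributed as a JSON object mapping split names to lists of chain IDs.
# The file name and keys below are assumptions, not taken from the PiFold code.
SPLIT_FILE = "chain_set_splits.json"

with open(SPLIT_FILE) as f:
    splits = json.load(f)

train_ids = splits["train"]        # expected: 18024 chains
valid_ids = splits["validation"]   # expected: 608 chains
test_ids = splits["test"]          # expected: 1120 chains

print(len(train_ids), len(valid_ids), len(test_ids))
```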
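The Hardware Specification and Experiment Setup rows together fix the training configuration: ten PiGNN layers, hidden dimension 128, the Adam optimizer, batch size 8, learning rate 0.001, up to 100 epochs on NVIDIA V100s. The sketch below wires those hyperparameters into a minimal PyTorch setup; the model body is a placeholder per-residue MLP (`PiFoldSketch`) standing in for the actual PiGNN layers, and the CATH data pipeline is omitted.

```python
import torch
from torch import nn
from torch.optim import Adam

# Hyperparameters reported in the Experiment Setup row.
NUM_LAYERS = 10
HIDDEN_DIM = 128
BATCH_SIZE = 8
LEARNING_RATE = 1e-3
EPOCHS = 100
NUM_AMINO_ACIDS = 20

class PiFoldSketch(nn.Module):
    """Placeholder: a stack of per-residue MLP blocks standing in for the
    actual PiGNN layers defined in the released PyTorch code. Only the
    hyperparameter wiring (depth, width, one-shot readout) is shown."""

    def __init__(self, node_dim: int, hidden_dim: int = HIDDEN_DIM, num_layers: int = NUM_LAYERS):
        super().__init__()
        self.embed = nn.Linear(node_dim, hidden_dim)
        self.layers = nn.ModuleList(
            [nn.Sequential(nn.Linear(hidden_dim, hidden_dim), nn.ReLU()) for _ in range(num_layers)]
        )
        self.readout = nn.Linear(hidden_dim, NUM_AMINO_ACIDS)  # per-residue logits

    def forward(self, node_feats: torch.Tensor) -> torch.Tensor:
        h = self.embed(node_feats)       # (num_residues, hidden_dim)
        for layer in self.layers:
            h = layer(h)
        return self.readout(h)           # (num_residues, 20)

device = "cuda" if torch.cuda.is_available() else "cpu"  # paper trained on NVIDIA V100s
model = PiFoldSketch(node_dim=64).to(device)             # node_dim is an arbitrary placeholder
optimizer = Adam(model.parameters(), lr=LEARNING_RATE)
criterion = nn.CrossEntropyLoss()  # assumed per-residue cross-entropy over the 20 amino acids

# Smoke test: a fake batch of 5 residues with 64-dim features.
dummy = torch.randn(5, 64, device=device)
print(model(dummy).shape)  # torch.Size([5, 20])
```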