Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
P(all-atom) Is Unlocking New Path For Protein Design
Authors: Wei Qu, Jiawei Guan, Rui Ma, Ke Zhai, Weikun Wu, Haobo Wang
ICML 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results demonstrate that Pallatom excels in key metrics of protein design, including designability, diversity, and novelty, showing significant improvements across the board. Our extensive experiments show that by learning P(all-atom), high-quality all-atom proteins can be successfully generated. |
| Researcher Affiliation | Collaboration | 1Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, China 2LEVINTHAL Biotechnology Co.Ltd, Hangzhou, China. Correspondence to: Weikun Wu <EMAIL>, Haobo Wang <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 Pallatom Inference, Algorithm 2 Main Trunk, Algorithm 3 Template Embedder, Algorithm 4 Atom Feature Encoder, Algorithm 5 Atom Attention Decoder, Algorithm 6 Node Update, Algorithm 7 Pair Update, Algorithm 8 Smooth LDDT loss. |
| Open Source Code | Yes | Code Availability: Pallatom is available on GitHub (https://github.com/levinthal/Pallatom). |
| Open Datasets | Yes | The training dataset of the model includes the PDB (Zardecki et al., 2022) and the AlphaFold Database (AFDB) (Varadi et al., 2021). |
| Dataset Splits | No | The paper describes extensive data cleaning and filtering processes applied to PDB and AFDB datasets (Appendix B), resulting in a curated dataset of 27,697 protein structures. However, it does not explicitly provide specific train/validation/test splits (percentages, counts, or predefined splits) for reproducing experiments on this data. |
| Hardware Specification | Yes | Training time: 10 days; device: 4× A6000. All methods were tested on the same hardware: CPU: AMD EPYC 7402 @ 2.8 GHz, GPU: NVIDIA GeForce RTX 4090 with 24 GB VRAM. |
| Software Dependencies | No | The paper mentions using the "Adam optimizer" and "JAX's JIT compilation" but does not specify version numbers for these or any other key software libraries or frameworks used in their implementation. |
| Experiment Setup | Yes | The model training utilized the Adam optimizer (Kingma & Ba, 2017) with a learning rate of 1e-3, β1 = 0.9, β2 = 0.999, and a batch size of 32. Table 6: Pallatom training hyperparameters provides detailed settings including loss weights, diffusion timesteps, noise schedule parameters, transformer dimensions, and number of decoder units. |
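To make the reported optimizer settings concrete, the following is a minimal plain-Python sketch of a single Adam update using the hyperparameters quoted above (learning rate 1e-3, β1 = 0.9, β2 = 0.999). Pallatom itself is implemented in JAX; this standalone version, including the `adam_step` helper and the scalar example values, is purely illustrative and not the authors' code.

```python
import math

def adam_step(param, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a single scalar parameter.

    m, v are the running first/second moment estimates; t is the
    1-based step count used for bias correction (Kingma & Ba, 2017).
    """
    m = beta1 * m + (1 - beta1) * grad          # first-moment (mean) estimate
    v = beta2 * v + (1 - beta2) * grad ** 2     # second-moment (uncentered variance) estimate
    m_hat = m / (1 - beta1 ** t)                # bias-corrected moments
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v

# Example: one step from param = 1.0 with gradient 0.5.
p, m, v = adam_step(1.0, 0.5, m=0.0, v=0.0, t=1)
```

After bias correction, the first step moves the parameter by roughly the learning rate (≈1e-3) regardless of the gradient's scale, which is why Adam is often paired with a fixed learning rate like the 1e-3 used here.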