Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

P(all-atom) Is Unlocking New Path For Protein Design

Authors: Wei Qu, Jiawei Guan, Rui Ma, Ke Zhai, Weikun Wu, Haobo Wang

ICML 2025 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results demonstrate that Pallatom excels in key metrics of protein design, including designability, diversity, and novelty, showing significant improvements across the board. Our extensive experiments show that by learning P(all-atom), high-quality all-atom proteins can be successfully generated.
Researcher Affiliation | Collaboration | 1 Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, China; 2 LEVINTHAL Biotechnology Co., Ltd., Hangzhou, China. Correspondence to: Weikun Wu <EMAIL>, Haobo Wang <EMAIL>.
Pseudocode | Yes | Algorithm 1 (Pallatom Inference), Algorithm 2 (Main Trunk), Algorithm 3 (Template Embedder), Algorithm 4 (Atom Feature Encoder), Algorithm 5 (Atom Attention Decoder), Algorithm 6 (Node Update), Algorithm 7 (Pair Update), Algorithm 8 (Smooth LDDT Loss).
Open Source Code | Yes | Code Availability: Pallatom is available on GitHub (https://github.com/levinthal/Pallatom).
Open Datasets | Yes | The training dataset of the model includes the PDB (Zardecki et al., 2022) and the AlphaFold Database (AFDB) (Varadi et al., 2021).
Dataset Splits | No | The paper describes extensive data cleaning and filtering applied to the PDB and AFDB datasets (Appendix B), resulting in a curated dataset of 27,697 protein structures. However, it does not explicitly provide specific train/validation/test splits (percentages, counts, or predefined splits) for reproducing experiments on this data.
Hardware Specification | Yes | Training time: 10 days on 4× NVIDIA A6000 GPUs. All methods were tested on the same hardware: CPU AMD EPYC 7402 @ 2.8 GHz, GPU NVIDIA GeForce RTX 4090 with 24 GB VRAM.
Software Dependencies | No | The paper mentions using the "Adam optimizer" and "JAX's JIT compilation" but does not specify version numbers for these or any other key software libraries or frameworks used in their implementation.
Experiment Setup | Yes | The model training utilized the Adam optimizer (Kingma & Ba, 2017) with a learning rate of 1e-3, β1 = 0.9, β2 = 0.999, and a batch size of 32. Table 6 (Pallatom training hyperparameters) provides detailed settings including loss weights, diffusion timesteps, noise schedule parameters, transformer dimensions, and number of decoder units.
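The Pseudocode row lists Algorithm 8 (Smooth LDDT Loss). As a reading aid, the sigmoid-smoothed LDDT idea can be sketched in NumPy: score each atom pair within a cutoff by how closely predicted and target pairwise distances agree, smoothed with sigmoids at the four standard LDDT thresholds (0.5, 1, 2, 4 Å). This is a minimal illustration of the general technique, not a reproduction of the paper's Algorithm 8; the cutoff value and function shape here are assumptions.

```python
import numpy as np

def smooth_lddt_loss(pred, target, cutoff=15.0):
    """Differentiable LDDT-style loss between two (N, 3) coordinate sets.

    For each atom pair within `cutoff` (measured on the target), the
    distance error is scored with sigmoids at thresholds 0.5/1/2/4 A,
    then averaged; the loss is 1 minus that agreement score.
    """
    dp = np.linalg.norm(pred[:, None] - pred[None, :], axis=-1)
    dt = np.linalg.norm(target[:, None] - target[None, :], axis=-1)
    diff = np.abs(dp - dt)
    # sigmoid(t - diff): near 1 when the error is well under threshold t
    score = sum(1.0 / (1.0 + np.exp(diff - t)) for t in (0.5, 1.0, 2.0, 4.0)) / 4.0
    n = pred.shape[0]
    mask = (dt < cutoff) & ~np.eye(n, dtype=bool)  # scored pairs, excluding self-pairs
    lddt = (score * mask).sum() / max(mask.sum(), 1)
    return 1.0 - lddt
```

Because every threshold term is a sigmoid rather than a hard step, the loss stays differentiable, which is what makes an LDDT-style objective usable for gradient-based training.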
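Since the Dataset Splits row notes that no train/validation/test partition of the 27,697 curated structures is given, a reproducibility-minded reader would have to define one. A seeded index split like the sketch below is one conventional way to do that; the 5%/5% fractions are illustrative assumptions, not values from the paper.

```python
import numpy as np

def make_splits(n_items, val_frac=0.05, test_frac=0.05, seed=0):
    """Deterministic train/val/test index split over n_items examples."""
    rng = np.random.default_rng(seed)  # fixed seed makes the split reproducible
    idx = rng.permutation(n_items)
    n_val = int(n_items * val_frac)
    n_test = int(n_items * test_frac)
    return {
        "test": idx[:n_test],
        "val": idx[n_test:n_test + n_val],
        "train": idx[n_test + n_val:],
    }

splits = make_splits(27697)  # size of the paper's curated dataset
```

For protein structures specifically, a random split like this can leak homologous sequences across partitions; a sequence- or cluster-based split would be the stricter choice, but that requires clustering information the report does not cover.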
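The Software Dependencies row flags that no library versions are reported. When reproducing such a setup, a standard first step is to record the versions actually installed; the snippet below does this with the standard library. The package names are assumptions about a typical JAX training stack, not a list from the paper.

```python
from importlib import metadata

# Record whatever versions are installed locally, since the paper pins none.
for pkg in ("numpy", "jax", "optax"):
    try:
        print(pkg, metadata.version(pkg))
    except metadata.PackageNotFoundError:
        print(pkg, "not installed")
```

Committing the resulting version list (for example via `python -m pip freeze`) alongside the code is what would close the gap this row identifies.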
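The Experiment Setup row fully specifies the optimizer, which can be sketched as a single Adam update (Kingma & Ba, 2017) with the reported settings. This is a minimal NumPy illustration of the published update rule, not the authors' JAX implementation; the epsilon value is the common default and an assumption here.

```python
import numpy as np

# Reported settings: Adam, lr 1e-3, beta1 = 0.9, beta2 = 0.999, batch size 32.
LR, BETA1, BETA2, EPS = 1e-3, 0.9, 0.999, 1e-8

def adam_step(param, grad, m, v, t):
    """One Adam update; t is the 1-indexed step count for bias correction."""
    m = BETA1 * m + (1 - BETA1) * grad          # first-moment EMA
    v = BETA2 * v + (1 - BETA2) * grad ** 2     # second-moment EMA
    m_hat = m / (1 - BETA1 ** t)                # bias-corrected moments
    v_hat = v / (1 - BETA2 ** t)
    param = param - LR * m_hat / (np.sqrt(v_hat) + EPS)
    return param, m, v
```

On the very first step the bias-corrected moments cancel the gradient's magnitude, so the parameter moves by roughly the learning rate (1e-3) in the direction opposite the gradient, regardless of the gradient's scale.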