Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Group Ligands Docking to Protein Pockets
Authors: Jiaqi Guan, Jiahan Li, Xiangxin Zhou, Xingang Peng, Sheng Wang, Yunan Luo, Jian Peng, Jianzhu Ma
ICLR 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Section 4 is titled "EXPERIMENTS" and includes subsections like "EXPERIMENTAL SETUP", "IMPROVING DOCKING PERFORMANCE WITH GROUPBIND", and "ABLATION STUDIES". It presents quantitative results in Table 1 and Table 2, discussing metrics like RMSD percentiles, and provides visual analyses in Figure 3 and Figure 4. This clearly indicates empirical studies with data analysis. |
| Researcher Affiliation | Collaboration | The authors are affiliated with: University of Illinois Urbana-Champaign, Tsinghua University, University of Chinese Academy of Sciences, Peking University, University of Washington, Georgia Institute of Technology (all academic institutions) and Helixon Research (a private research entity). The mix of university and private research affiliations indicates a collaboration. |
| Pseudocode | Yes | The paper includes "Algorithm 1 Procedure of preprocessing data for GROUPBIND" in Appendix B, which describes structured steps for data preprocessing. |
| Open Source Code | No | The paper does not contain any explicit statement about releasing source code for the methodology described, nor does it provide a direct link to a code repository. |
| Open Datasets | Yes | The paper states: "We evaluate our method on PDBBind v2020 dataset (Liu et al., 2015)". PDBBind is a well-known public dataset, and a citation is provided for its access. |
| Dataset Splits | Yes | The paper specifies: "We use the same time-based split following previous work (Stärk et al., 2022; Lu et al., 2022; Corso et al., 2022), resulting in 17k complexes from 2018 or earlier for training/validation and 363 complexes from 2019 with no ligand overlap for testing." |
| Hardware Specification | Yes | The paper explicitly states: "We trained our model on eight NVIDIA A100 GPUs for 350 epochs." It also notes: "In comparison, DiffDock is trained on four 48GB RTX A6000 GPUs for 850 epochs." |
| Software Dependencies | No | The paper mentions using the "e3nn library (Geiger & Smidt, 2022)" and the "AdamW (Loshchilov & Hutter, 2017) optimizer" but does not provide version numbers for these or any other software components. Without pinned versions, the software environment cannot be fully reproduced. |
| Experiment Setup | Yes | The paper provides specific experimental setup details, including: "We set the pocket radius as 20 Å", "We set the maximum number of ligands in the group... as 5", "We set a smaller sigma 10.0 for the translational noise", "distance loss is weighted by 0.1", "AdamW ... optimizer with the learning rate of 0.001 and the weight decay of 10.0", model-architecture details such as "5 layers with 48 scalar features and 10 vector features" (or "24 scalar features and 6 vector features" for ablation studies), and "sample 40 poses per pocket". |
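The time-based split quoted in the Dataset Splits row (train/validate on complexes from 2018 or earlier, test on 2019 complexes with no ligand overlap) can be sketched as follows. This is a minimal illustration only; the names `Complex` and `time_based_split` are assumptions, not part of the paper's or PDBBind's actual preprocessing code.

```python
from dataclasses import dataclass

# Illustrative stand-in for a PDBBind entry; the real records carry
# full structures, not just an ID, year, and ligand string.
@dataclass
class Complex:
    pdb_id: str
    year: int
    ligand_smiles: str

def time_based_split(complexes, cutoff_year=2018):
    """Train/val on complexes from cutoff_year or earlier; test on later
    complexes whose ligands never appear in the train/val set."""
    train_val = [c for c in complexes if c.year <= cutoff_year]
    seen_ligands = {c.ligand_smiles for c in train_val}
    test = [c for c in complexes
            if c.year > cutoff_year and c.ligand_smiles not in seen_ligands]
    return train_val, test
```

The "no ligand overlap" condition is what the set-membership check enforces: a 2019 complex is dropped from the test set if its ligand already occurs in training or validation data.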
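For readers collecting the setup values quoted in the Experiment Setup row, they can be gathered into a single configuration sketch. The field names here are assumptions chosen for readability; they do not come from the GroupBind codebase.

```python
# Hypothetical configuration assembled from the hyperparameters quoted
# in the paper; field names are illustrative, not the authors' code.
GROUPBIND_SETUP = {
    "pocket_radius_angstrom": 20.0,
    "max_ligands_per_group": 5,
    "translation_noise_sigma": 10.0,
    "distance_loss_weight": 0.1,
    "optimizer": "AdamW",
    "learning_rate": 1e-3,
    "weight_decay": 10.0,
    "epochs": 350,
    "hardware": "8x NVIDIA A100",
    "model": {"layers": 5, "scalar_features": 48, "vector_features": 10},
    "ablation_model": {"scalar_features": 24, "vector_features": 6},
    "poses_sampled_per_pocket": 40,
}

def check_setup(cfg: dict) -> bool:
    """Sanity-check that a few required numeric fields are present and positive."""
    required = ["pocket_radius_angstrom", "learning_rate", "epochs"]
    return all(cfg.get(k, 0) > 0 for k in required)
```

A structured record like this makes it easy to spot which settings a replication attempt would still need to pin down (notably software versions, which the paper omits).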