LinkerNet: Fragment Poses and Linker Co-Design with 3D Equivariant Diffusion

Authors: Jiaqi Guan, Xingang Peng, Peiqi Jiang, Yunan Luo, Jian Peng, Jianzhu Ma

NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirical studies on ZINC and PROTAC-DB datasets demonstrate that our model can generate chemically valid, synthetically-accessible, and low-energy molecules under both unconstrained and constrained generation settings.
Researcher Affiliation | Academia | Jiaqi Guan, University of Illinois Urbana-Champaign, jiaqi@illinois.edu; Xingang Peng, Peking University, xingang.peng@gmail.com; Peiqi Jiang, Tsinghua University, jpq20@mails.tsinghua.edu.cn; Yunan Luo, Georgia Institute of Technology, yunan@gatech.edu; Jian Peng, University of Illinois Urbana-Champaign, jianpeng@illinois.edu; Jianzhu Ma, Tsinghua University, majianzhu@tsinghua.edu.cn
Pseudocode | Yes | Algorithm 1 Training Procedure of LinkerNet ... Algorithm 2 Sampling Procedure of LinkerNet
Open Source Code | Yes | Reproducibility Statement: The model implementation, experimental data and model checkpoints can be found here: https://github.com/guanjq/LinkerNet
Open Datasets | Yes | We use a subset of ZINC [43] for the unconstrained generation. ... For the constrained generation, we use PROTAC-DB [45], a database collecting PROTACs from the literature or calculated by programs.
Dataset Splits | Yes | We use the same procedure as [21] to create fragment-linker pairs and randomly split the dataset, which results in a training/validation/test set with 438,610 / 400 / 400 examples. For the constrained generation, we use PROTAC-DB [45], a database collecting PROTACs from the literature or calculated by programs. The same procedure is applied to obtain reference conformations and create data pairs. We select 10 different warheads as the test set (43 examples) and the remaining as the training set (992 examples).
Hardware Specification | Yes | We trained our model on one NVIDIA RTX A6000 GPU.
Software Dependencies | No | The paper mentions "AdamW [30]" as an optimizer but does not provide specific software versions for libraries, frameworks, or programming languages (e.g., PyTorch version, Python version).
Experiment Setup | Yes | The model is trained via AdamW [30] with init_learning_rate=5e-4, betas=(0.99, 0.999), batch_size=64 and clip_gradient_norm=50.0. To balance the scales of different losses, we multiply the atom type loss and bond type loss by a factor of λ = 100. During the training phase, we add a small Gaussian noise with a standard deviation of 0.05 to linker atom coordinates as data augmentation. We also decay the learning rate exponentially with a factor of 0.6 and a minimum learning rate of 1e-6. The learning rate is decayed if there is no improvement in the validation loss for 10 consecutive evaluations. The evaluation is performed every 2000 training steps.
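
The Experiment Setup row above fully specifies the optimizer, loss weighting, data augmentation, and learning-rate schedule. Below is a minimal PyTorch-style sketch of that configuration. The `LinkerNet` class, `get_losses` method, `evaluate` helper, `train_loader`, `val_loader`, and `batch.linker_pos` attribute are hypothetical placeholders (not the released implementation), and the plateau-based scheduler is one plausible reading of the described decay rule.

```python
# Sketch of the reported training configuration, assuming PyTorch.
# LinkerNet, get_losses, evaluate, train_loader, val_loader and batch.linker_pos
# are hypothetical placeholders, not the released implementation.
import torch
from torch.optim import AdamW
from torch.optim.lr_scheduler import ReduceLROnPlateau

model = LinkerNet()  # hypothetical model class

optimizer = AdamW(model.parameters(), lr=5e-4, betas=(0.99, 0.999))
# Decay the LR by 0.6 when the validation loss stalls for 10 evaluations,
# with a floor of 1e-6.
scheduler = ReduceLROnPlateau(optimizer, factor=0.6, patience=10, min_lr=1e-6)

LAMBDA = 100.0     # weight on the atom-type and bond-type losses
NOISE_STD = 0.05   # std of Gaussian noise added to linker atom coordinates
CLIP_NORM = 50.0   # gradient clipping norm
EVAL_EVERY = 2000  # validation frequency in training steps

for step, batch in enumerate(train_loader):  # train_loader built with batch_size=64
    # Data augmentation: small Gaussian noise on linker atom coordinates.
    batch.linker_pos = batch.linker_pos + NOISE_STD * torch.randn_like(batch.linker_pos)

    pos_loss, atom_loss, bond_loss = model.get_losses(batch)  # hypothetical API
    loss = pos_loss + LAMBDA * (atom_loss + bond_loss)

    optimizer.zero_grad()
    loss.backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=CLIP_NORM)
    optimizer.step()

    if (step + 1) % EVAL_EVERY == 0:
        val_loss = evaluate(model, val_loader)  # hypothetical validation helper
        scheduler.step(val_loss)
```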
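
For the ZINC split reported in the Dataset Splits row, the pairs are created with the procedure of reference [21] and then split at random into 438,610 / 400 / 400 examples. A schematic sketch of such a random split is shown below; `load_fragment_linker_pairs` is a hypothetical loader, and the actual pair construction follows [21].

```python
# Schematic random split reproducing the reported ZINC set sizes.
import random

pairs = load_fragment_linker_pairs()  # hypothetical loader; 439,410 pairs in total
random.seed(0)
random.shuffle(pairs)

val_set = pairs[:400]
test_set = pairs[400:800]
train_set = pairs[800:]  # 438,610 training examples
```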