Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
KinDEL: DNA-Encoded Library Dataset for Kinase Inhibitors
Authors: Benson Chen, Tomasz Danel, Gabriel H. S. Dreiman, Patrick J. McEnaney, Nikhil Jain, Kirill Novikov, Spurti Umesh Akki, Joshua L. Turnbull, Virja Atul Pandya, Boris P. Belotserkovskii, Jared Bryce Weaver, Ankita Biswas, Dat Nguyen, Kent Gorday, Mohammad Sultan, Nathaniel Stanley, Daniel M. Whalen, Divya Kanichar, Christoph Klein, Emily Fox, R. Edward Watts
ICML 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To address this gap, we introduce KinDEL, one of the largest publicly accessible DEL datasets and the first one that includes binding poses from molecular docking experiments. Focused on two kinases, Mitogen-Activated Protein Kinase 14 (MAPK14) and Discoidin Domain Receptor Tyrosine Kinase 1 (DDR1), KinDEL includes 81 million compounds, offering a rich resource for computational exploration. Additionally, we provide comprehensive biophysical assay validation data, encompassing both on-DNA and off-DNA measurements, which we use to evaluate a suite of machine learning techniques, including novel structure-based probabilistic models. We hope that our benchmark, encompassing both 2D and 3D structures, will help advance the development of machine learning models for data-driven hit identification using DELs. |
| Researcher Affiliation | Industry | 1Insitro, South San Francisco, CA 94080, USA. Correspondence to: Benson Chen <EMAIL>, Tomasz Danel <EMAIL>. |
| Pseudocode | No | The paper describes various machine learning models (Random Forest, XGBoost, k-Nearest Neighbors, Deep Neural Network, GIN, Chemprop, DEL-Compose) and their architectures but does not include explicit pseudocode or algorithm blocks for any of them. |
| Open Source Code | Yes | Data and code for our benchmarks can be found at https://github.com/insitro/kindel. |
| Open Datasets | Yes | To demonstrate the advantages of DEL data and promote development of the methods described above, we release KinDEL (Kinase Inhibitor DNA-Encoded Library), a library of 81 million small molecules tested against two kinase targets, MAPK14 and DDR1. ... Data and code for our benchmarks can be found at https://github.com/insitro/kindel. |
| Dataset Splits | Yes | We split our datasets using three strategies, ensuring that all held-out compounds are placed in the test set and not used for training. The first type of data split is a random split, where a randomly selected 10% of compounds are placed in the validation set, and another randomly selected 10% are placed in the test set. The second type of data split is a disynthon split, where we sample disynthon structures (molecules with the same 2 synthons), and put all compounds containing these sampled structures in the same subset using the same 80-10-10 ratio between the training, validation, and test sets. The third approach is a cluster split based on compound similarity. |
| Hardware Specification | Yes | Then HDBSCAN (McInnes et al., 2017) is used to cluster compounds; in a GPU implementation from NVIDIA (cuML), this runs in a few hours on a single Tesla T4. |
| Software Dependencies | Yes | Tautomer and protonation state selection was performed using Epik Classic (Shelley et al., 2007) from Schrödinger Suite 2024-4 at pH 7.4. For each library member, a random subset of stereoisomers was enumerated using rdkit 2023.09.2 ... Docking was performed using the Vina scoring function (Trott & Olson, 2010) in Uni-Dock 1.1.2 (Yu et al., 2023) ... All receptors were prepared using the Protein Preparation Wizard from Schrödinger Suite 2023-4... |
| Experiment Setup | Yes | Random Forest and XGBoost used 100 decision trees trained with the squared error criterion, and the depth of decision trees was not restricted. The k-Nearest Neighbors model used 5 nearest neighbors. The final architecture of the DNN model consisted of 5 linear layers with the ReLU activation functions except for the last one. Batch norm and dropout layers (with the probability of zeroing an element equal to 20%) were applied after each layer before the activation layers. The hidden dimension size was set to 512 for all layers. The GIN model has 5 GIN convolutional layers with hidden dimension size equal to 256. ... Chemprop uses three layers of bond message passing with the hidden dimension of 300. ... The DEL-Compose model ... The learning rate used to train DEL-Compose was 5e-5, and the batch size was 64. |
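For illustration, the random and disynthon split strategies quoted in the Dataset Splits row can be sketched in plain Python. This is a minimal sketch based only on the quoted description, not the authors' actual implementation; the `synthon_a`/`synthon_b` keys on each compound record are hypothetical names, and the 80-10-10 fractions follow the paper's stated ratio.

```python
import random
from collections import defaultdict


def random_split(compounds, seed=0, val_frac=0.1, test_frac=0.1):
    """Random 80-10-10 split: 10% validation, 10% test, rest training."""
    rng = random.Random(seed)
    idx = list(range(len(compounds)))
    rng.shuffle(idx)
    n_test = int(len(idx) * test_frac)
    n_val = int(len(idx) * val_frac)
    test_idx = set(idx[:n_test])
    val_idx = set(idx[n_test:n_test + n_val])
    splits = {"train": [], "valid": [], "test": []}
    for i, c in enumerate(compounds):
        if i in test_idx:
            splits["test"].append(c)
        elif i in val_idx:
            splits["valid"].append(c)
        else:
            splits["train"].append(c)
    return splits


def disynthon_split(compounds, seed=0, val_frac=0.1, test_frac=0.1):
    """Disynthon split: sample (synthon_a, synthon_b) pairs, then place
    every compound containing a sampled pair into the same subset, so a
    disynthon never spans the train/validation/test boundary."""
    groups = defaultdict(list)
    for c in compounds:
        groups[(c["synthon_a"], c["synthon_b"])].append(c)
    keys = list(groups)
    rng = random.Random(seed)
    rng.shuffle(keys)
    n_test = int(len(keys) * test_frac)
    n_val = int(len(keys) * val_frac)
    splits = {"train": [], "valid": [], "test": []}
    for j, k in enumerate(keys):
        if j < n_test:
            subset = "test"
        elif j < n_test + n_val:
            subset = "valid"
        else:
            subset = "train"
        splits[subset].extend(groups[k])
    return splits
```

The key difference between the two strategies is the unit of sampling: the random split samples individual compounds, while the disynthon split samples synthon pairs, which gives a harder generalization test because held-out compounds share no disynthon with the training set.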