Predicting Dose-Response Curves with Deep Neural Networks

Authors: Pedro Alonso Campana, Paul Prasse, Tobias Scheffer

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We develop a neural model that uses an embedding of the interaction between drug molecules and the tissue transcriptome to estimate the entire dose-response curve rather than a scalar aggregate. We find that, compared to the prior state of the art, this model excels at interpolating and extrapolating the inhibitory effect of untried concentrations. Unlike prevalent parametric models, it is able to accurately predict dose-response curves of drugs on cells with previously unseen transcriptomes as well as of previously untested drug molecules on established cell lines. Our implementation is available at https://github.com/alonsocampana/ARCANet. Section 5. Experiments. (An illustrative model sketch follows the table.)
Researcher Affiliation | Academia | Pedro A. Campana, Paul Prasse, Tobias Scheffer; Department of Computer Science, University of Potsdam, Potsdam, Germany. Correspondence to: Pedro A. Campana <alonsocampana@uni-potsdam.de>.
Pseudocode | No | The paper includes architectural diagrams (Figure 1 and Figure 2) and describes the functional modules of ARCANet, but it does not provide any pseudocode or algorithm blocks.
Open Source Code | Yes | Our implementation is available at https://github.com/alonsocampana/ARCANet. The code used for downloading and preprocessing the data, as well as for reproducing our experiments, can be found at https://github.com/alonsocampana/ARCANet.
Open Datasets | Yes | We use the largest publicly available repositories of dose-response data; Table 1 summarizes the data set characteristics. The Genomics of Drug Sensitivity in Cancer project (GDSC) (Yang et al., 2012; Iorio et al., 2016)... Cancer Therapeutics Response Portal (CTRPv2) (Rees et al., 2015), and PRISM (Corsello et al., 2020)... NCI60 (Shoemaker, 2006)... All experiments performed are based on publicly available datasets.
Dataset Splits | Yes | We reserve the older GDSC1 data exclusively for prototyping and hyper-parameter tuning, using 90% of the cell lines for training and 10% for hyper-parameter tuning, with the MSE for precision oncology as objective. We split each of these data sets separately for smoothing, interpolation, extrapolation, and precision oncology, as described in Section 5.3. We partition the available data into 10 folds, unless a data set has fewer points per curve for smoothing and interpolation, in which case we adjust the number of folds. We finally perform cross-validation over these folds, training each model for 100 epochs. For NCI60, we use a single three-way split into 80% of the drugs for training, 10% of drugs for hyper-parameter tuning, and 10% for evaluation; we limit the number of training epochs to 50... (A drug-wise splitting sketch follows the table.)
Hardware Specification | Yes | All neural network models and baselines are trained, using the PyTorch Geometric library (Fey & Lenssen, 2019) and PyTorch (Paszke et al., 2019), on one NVIDIA A100-SXM4-40GB GPU.
Software Dependencies | Yes | All neural network models and baselines are trained using the PyTorch Geometric library (Fey & Lenssen, 2019) and PyTorch (Paszke et al., 2019)... We optimize hyper-parameters using Bayesian optimization with 100 configuration proposals and early stopping using the median stopping rule in the Optuna (Akiba et al., 2019) framework.
Experiment Setup | Yes | We optimize hyper-parameters using Bayesian optimization with 100 configuration proposals and early stopping using the median stopping rule in the Optuna (Akiba et al., 2019) framework. Both the search space and the final configuration are shown in Table 2. We finally perform cross-validation over these folds, training each model for 100 epochs. For NCI60... we limit the number of training epochs to 50. (An Optuna sketch follows the table.)
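
For readers who want a concrete picture of the prediction task in the Research Type row, here is a minimal PyTorch sketch of a model that embeds a drug and a transcriptome, fuses the two embeddings, and predicts inhibition at arbitrary query concentrations, so that sweeping the concentration input yields the whole dose-response curve rather than a scalar summary. The layer sizes, fingerprint inputs, and concatenation-based fusion are illustrative assumptions, not ARCANet's actual architecture (the PyTorch Geometric dependency suggests the paper's drug encoder is graph-based).

    import torch
    import torch.nn as nn

    class CurvePredictor(nn.Module):
        """Illustrative stand-in for a dose-response curve model (not ARCANet)."""

        def __init__(self, drug_dim=2048, expr_dim=1000, hidden=256):
            super().__init__()
            self.drug_enc = nn.Sequential(nn.Linear(drug_dim, hidden), nn.ReLU())
            self.cell_enc = nn.Sequential(nn.Linear(expr_dim, hidden), nn.ReLU())
            # Interaction embedding plus log-concentration -> fractional inhibition.
            self.head = nn.Sequential(
                nn.Linear(2 * hidden + 1, hidden), nn.ReLU(),
                nn.Linear(hidden, 1), nn.Sigmoid(),
            )

        def forward(self, drug, expr, log_conc):
            z = torch.cat([self.drug_enc(drug), self.cell_enc(expr)], dim=-1)
            x = torch.cat([z, log_conc.unsqueeze(-1)], dim=-1)
            return self.head(x).squeeze(-1)  # inhibition in [0, 1]

    model = CurvePredictor()
    drug = torch.rand(4, 2048)   # e.g. molecular fingerprints (assumed input)
    expr = torch.rand(4, 1000)   # transcriptome features (assumed input)
    # Sweep 8 query concentrations to read out the full predicted curves.
    curve = torch.stack([model(drug, expr, torch.full((4,), float(c)))
                         for c in torch.linspace(-3.0, 1.0, 8)])
    print(curve.shape)  # torch.Size([8, 4]): one predicted curve per sample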
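
The NCI60 protocol in the Dataset Splits row is a drug-wise 80/10/10 three-way split: each drug, with all of its measurements, lands in exactly one partition. A minimal sketch of such a grouped split, assuming toy records and an arbitrary random seed (neither comes from the paper):

    import numpy as np

    # Toy records: (drug_id, cell_line, log_conc, response). Real data would
    # come from the authors' download and preprocessing scripts.
    dataset = [(f"drug{i % 50}", f"cell{i % 10}", -2.0 + i % 8, 0.5)
               for i in range(4000)]

    rng = np.random.default_rng(0)  # the seed is an assumption
    drugs = np.array(sorted({r[0] for r in dataset}))
    rng.shuffle(drugs)

    n = len(drugs)
    train_drugs = set(drugs[: int(0.8 * n)])             # 80% of drugs: training
    tune_drugs = set(drugs[int(0.8 * n): int(0.9 * n)])  # 10%: hyper-parameter tuning
    test_drugs = set(drugs[int(0.9 * n):])               # 10%: evaluation

    # Each measurement follows its drug, so no evaluation drug leaks into training.
    train = [r for r in dataset if r[0] in train_drugs]
    test = [r for r in dataset if r[0] in test_drugs]
    print(len(train), len(test))

Splitting by drug rather than by individual measurement is what makes the evaluation test generalization to previously untested molecules, matching the paper's stated goal.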
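
The search procedure named in the Software Dependencies and Experiment Setup rows (Bayesian optimization, 100 configuration proposals, median stopping rule) maps directly onto Optuna's public API. A minimal sketch follows; the train_and_validate stub and the search space are invented placeholders, since the real space and final configuration live in the paper's Table 2.

    import optuna

    def train_and_validate(lr, hidden, dropout):
        """Hypothetical stub: yields a validation MSE after each epoch."""
        for epoch in range(100):
            yield lr * 100.0 / (epoch + 1) + dropout * 0.01  # dummy score

    def objective(trial):
        # Invented search space; the actual one is Table 2 of the paper.
        lr = trial.suggest_float("lr", 1e-5, 1e-2, log=True)
        hidden = trial.suggest_categorical("hidden", [128, 256, 512])
        dropout = trial.suggest_float("dropout", 0.0, 0.5)

        for epoch, val_mse in enumerate(train_and_validate(lr, hidden, dropout)):
            trial.report(val_mse, epoch)   # give the pruner intermediate scores
            if trial.should_prune():       # median stopping rule
                raise optuna.TrialPruned()
        return val_mse

    study = optuna.create_study(
        direction="minimize",                  # e.g. validation MSE
        pruner=optuna.pruners.MedianPruner(),  # Optuna's median stopping rule
    )
    study.optimize(objective, n_trials=100)    # 100 configuration proposals
    print(study.best_params)

Optuna's default TPE sampler is a form of Bayesian optimization, so a study created this way already matches the setup the rows describe without configuring an explicit sampler.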