Supervised Training of Conditional Monge Maps
Authors: Charlotte Bunne, Andreas Krause, Marco Cuturi
NeurIPS 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the ability of CondOT to infer the effect of an arbitrary combination of genetic or therapeutic perturbations on single cells, using only observations of the effects of said perturbations separately. Our results demonstrate the ability of our architectures to better capture, on out-of-sample observations, the effects of these variables in various settings, even when considering never-seen, composite context labels. We consider various high-dimensional problems arising from this scenario to evaluate the performance of CondOT (Section 3) against other baselines. |
| Researcher Affiliation | Collaboration | Charlotte Bunne (ETH Zurich, bunnec@ethz.ch); Andreas Krause (ETH Zurich, krausea@ethz.ch); Marco Cuturi (Apple, cuturi@apple.com) |
| Pseudocode | Yes | Given a dataset $\mathcal{D} = \{c_i, (\mu_i, \nu_i)\}_{i=0}^{N}$ of $N$ pairs of populations before ($\mu_i$) and after ($\nu_i$) transport, each connected to a context $c_i$, we detail in Algorithm 1, provided in Appendix B, a training loop that incorporates all of the architecture proposals described above. A minimal sketch of such a loop appears after the table. |
| Open Source Code | No | No explicit statement or link providing access to the source code for the methodology described in this paper was found. |
| Open Datasets | Yes | We evaluate our method on the task of inferring single-cell perturbation responses to the cancer drug Givinostat... The dataset contains 3,541 cells described with the gene expression levels of 1,000 highly-variable genes. We analyze CondOT's ability to accurately predict phenotypes of genetic perturbations based on single-cell RNA-sequencing pooled CRISPR screens (Norman et al., 2019; Dixit et al., 2016), comprising 98,419 single-cell gene expression profiles with 92 different genetic perturbations, each cell measured via a 1,500 highly-variable gene expression vector. A hedged preprocessing sketch follows the table. |
| Dataset Splits | Yes | We split the dataset into train / test splits of increasing difficulty (details on the dataset splits in Appendix D.2). Table 1: Evaluation of drug effect predictions... Results are reported based on MMD and the $\ell_2$ distance between perturbation signatures, with columns for in-sample and out-of-sample performance. A sketch of these two metrics follows the table. |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory amounts) used for running the experiments were provided in the paper. |
| Software Dependencies | No | The paper mentions software such as Optimal Transport Tools (OTT), a JAX toolbox (Cuturi et al., 2022), SCANPY (Wolf et al., 2018), and scikit-learn (Pedregosa et al., 2011), but no specific version numbers for these dependencies are provided. |
| Experiment Setup | No | The paper describes initialization strategies for its neural networks (identity initialization, Gaussian initialization) and mentions the Adam optimizer (Kingma and Ba, 2014), but it does not report specific hyperparameter values (e.g., learning rate, batch size, number of epochs) or detailed training configurations. A hedged sketch of identity initialization follows the table. |
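To make the Pseudocode row concrete, here is a minimal JAX/optax sketch of a training loop over `(context, source, target)` triples, mirroring the structure of the paper's Algorithm 1. Everything in it is an illustrative assumption: the quadratic `potential` stands in for the paper's partially input convex networks (PICNNs), the moment-matching `loss_fn` stands in for the actual neural-dual OT objective, and all dimensions and hyperparameters are invented.

```python
import jax
import jax.numpy as jnp
import optax

def potential(params, x, c):
    # Toy conditional potential, convex in x: a quadratic term plus a
    # context-modulated linear term. A placeholder, not the paper's PICNN.
    shift = params["A"] @ c + params["b"]
    return 0.5 * jnp.dot(x, x) + jnp.dot(shift, x)

# Conditional Monge map = gradient of the potential in x, vectorized over cells.
transport = jax.vmap(jax.grad(potential, argnums=1), in_axes=(None, 0, None))

def loss_fn(params, src, tgt, c):
    # Crude moment-matching surrogate standing in for the neural-dual OT
    # objective minimized in the paper's Algorithm 1.
    pushed = transport(params, src, c)
    return jnp.sum((pushed.mean(axis=0) - tgt.mean(axis=0)) ** 2)

d, k = 50, 8                                    # gene / context dims (invented)
params = {"A": jnp.zeros((d, k)), "b": jnp.zeros(d)}
opt = optax.adam(1e-3)                          # learning rate is a guess
opt_state = opt.init(params)

@jax.jit
def train_step(params, opt_state, src, tgt, c):
    loss, grads = jax.value_and_grad(loss_fn)(params, src, tgt, c)
    updates, opt_state = opt.update(grads, opt_state)
    return optax.apply_updates(params, updates), opt_state, loss

# Synthetic stand-in for D = {c_i, (mu_i, nu_i)}: one (context, source, target)
# triple with a unit mean shift playing the role of the perturbation effect.
kc, ks, kt = jax.random.split(jax.random.PRNGKey(0), 3)
dataset = [(jax.random.normal(kc, (k,)),
            jax.random.normal(ks, (128, d)),
            jax.random.normal(kt, (128, d)) + 1.0)]

for epoch in range(200):
    for c, src, tgt in dataset:
        params, opt_state, loss = train_step(params, opt_state, src, tgt, c)
```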
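For the Open Datasets row, a hedged SCANPY preprocessing sketch consistent with the gene panels described above; the file name is hypothetical and the paper's exact pipeline is not specified.

```python
import scanpy as sc

adata = sc.read_h5ad("norman_2019_crispr_screen.h5ad")  # hypothetical path
sc.pp.normalize_total(adata, target_sum=1e4)  # standard library-size scaling
sc.pp.log1p(adata)
# Keep the top highly-variable genes, mirroring the 1,500-gene panel reported
# for the CRISPR screens (1,000 for the Givinostat data).
sc.pp.highly_variable_genes(adata, n_top_genes=1500, subset=True)
```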
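For the Dataset Splits row, a small sketch of the two reported metrics. The RBF bandwidth `gamma` is an illustrative choice, and the perturbation-signature definition (mean expression shift relative to control cells) is an assumption in the spirit of common single-cell practice, not a quote from the paper.

```python
import jax.numpy as jnp

def mmd_rbf(x, y, gamma=1.0):
    """Squared MMD between samples x: (n, d) and y: (m, d) with an RBF
    kernel; the bandwidth gamma is an illustrative choice."""
    def k(a, b):
        sq = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
        return jnp.exp(-gamma * sq)
    return k(x, x).mean() + k(y, y).mean() - 2.0 * k(x, y).mean()

def l2_signature_distance(pred, observed, control):
    """l2 distance between perturbation signatures, assuming a signature is
    the mean expression shift relative to unperturbed control cells."""
    sig_pred = pred.mean(axis=0) - control.mean(axis=0)
    sig_obs = observed.mean(axis=0) - control.mean(axis=0)
    return jnp.linalg.norm(sig_pred - sig_obs)
```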
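For the Experiment Setup row, a sketch of one common reading of "identity initialization": pretrain the potential $f$ to approximate $\tfrac{1}{2}\lVert x\rVert^2$, whose gradient is the identity map, so the induced transport starts near the identity. The MLP architecture, learning rate, and step count below are guesses; the paper reports none of them.

```python
import jax
import jax.numpy as jnp
import optax

def init_mlp(key, sizes):
    # Small MLP potential f: R^d -> R; layer sizes are illustrative.
    keys = jax.random.split(key, len(sizes) - 1)
    return [(jax.random.normal(k, (m, n)) / jnp.sqrt(m), jnp.zeros(n))
            for k, m, n in zip(keys, sizes[:-1], sizes[1:])]

def f(params, x):
    for W, b in params[:-1]:
        x = jax.nn.relu(x @ W + b)
    W, b = params[-1]
    return (x @ W + b).squeeze(-1)

def identity_pretrain_loss(params, x):
    # Fit f(x) to 0.5 * ||x||^2, whose gradient is the identity map, so the
    # induced transport map starts out as (approximately) the identity.
    target = 0.5 * jnp.sum(x * x, axis=-1)
    return jnp.mean((f(params, x) - target) ** 2)

d = 50                                     # illustrative input dimension
key = jax.random.PRNGKey(0)
params = init_mlp(key, [d, 128, 128, 1])
opt = optax.adam(1e-3)                     # lr/steps are guesses, not reported
opt_state = opt.init(params)

@jax.jit
def step(params, opt_state, x):
    loss, grads = jax.value_and_grad(identity_pretrain_loss)(params, x)
    updates, opt_state = opt.update(grads, opt_state)
    return optax.apply_updates(params, updates), opt_state, loss

for i in range(1000):
    x = jax.random.normal(jax.random.fold_in(key, i), (256, d))
    params, opt_state, loss = step(params, opt_state, x)
```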