Neural Design for Genetic Perturbation Experiments

Authors: Aldo Pacchiano, Drausin Wulsin, Robert A. Barton, Luis Voloch

ICLR 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response

Research Type | Experimental | We conduct a series of experiments on synthetic and public data from the UCI database (Dua & Graff, 2017) and show that OAE is able to find the "optimal arm" using fewer batch queries than other algorithms such as greedy and random sampling. Our experimental evaluation covers both neurally realizable and not neurally realizable function landscapes.

Researcher Affiliation | Collaboration | Aldo Pacchiano (Microsoft Research NYC, apacchiano@microsoft.com); Drausin Wulsin, Robert A. Barton, and Luis Voloch (Immunai, {drausin,robert.barton,luis}@immunai.com)

Pseudocode | Yes | Algorithm 1: Optimistic Arm Elimination Principle (OAE); Algorithm 2: Optimistic Arm Elimination DvD (OAE-DvD); Algorithm 3: Optimistic Arm Elimination Batch Sequential (OAE-Seq); Algorithm 4: Noisy Batch Selection Principle (OAE). (A hedged sketch of the OAE selection step appears below the table.)

Open Source Code | Yes | We test a Bayesian OAE algorithm against the GeneDisco benchmark (Mehrjou et al., 2021), which assesses the "Hit Rate" of different experimental planning algorithms over a number of pooled CRISPR experiments. We assess our performance against the other acquisition functions provided in the public implementation (https://github.com/genedisco/genedisco-starter) that select batches based solely on uncertainty considerations. (A skeleton acquisition function in the starter-repository style also appears below.)

Open Datasets | Yes | We conduct a series of experiments on synthetic and public data from the UCI database (Dua & Graff, 2017) and show that OAE is able to find the "optimal arm" using fewer batch queries than other algorithms such as greedy and random sampling.

Dataset Splits | No | The paper mentions internal data processing that splits the UCI data into train, test, and validation sets but does not provide specific split percentages or counts: "Due to our internal data processing that splits the data into train, test and validation sets the number of datapoints we consider may be different from the size of their public versions."

Hardware Specification | No | The paper does not specify any hardware details such as GPU or CPU models used for the experiments.

Software Dependencies | No | The paper mentions using the "Adam optimizer" and "ReLU activations" but does not specify versions for libraries such as PyTorch, TensorFlow, or Python itself.

Experiment Setup | Yes | In all these cases we use ReLU activation functions trained for 5000 steps via the Adam optimizer using batches of size 10. In our tests we use a batch size B = 3, a number of batches N = 150, and repeat each experiment a total of 25 times, reporting average results with standard error bars at each time step. (A minimal training-loop sketch following these settings appears below.)
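
For reference, here is a minimal sketch of the OAE selection step referenced in the Pseudocode row, under stated assumptions: OAE fits a model that explains the observed rewards while being optimistic about a candidate arm, which is relaxed below into a single penalized objective. The network architecture, the optimism weight lam, and the function names (make_mlp, optimistic_value, oae_select_batch) are illustrative choices, not the paper's exact procedure.

```python
import numpy as np
import torch
import torch.nn as nn

def make_mlp(d_in: int) -> nn.Module:
    # Small ReLU network; width and depth are assumptions, the paper
    # only states that ReLU activations and Adam are used.
    return nn.Sequential(nn.Linear(d_in, 32), nn.ReLU(), nn.Linear(32, 1))

def optimistic_value(X_obs, y_obs, x_cand, lam=1.0, steps=500, lr=1e-2):
    """Soft optimistic fit: minimize MSE on the observed (arm, reward)
    pairs while rewarding a high prediction on the candidate arm.
    Assumes at least one observation; lam trades fit against optimism."""
    model = make_mlp(X_obs.shape[1])
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = nn.functional.mse_loss(model(X_obs).squeeze(-1), y_obs)
        loss = loss - lam * model(x_cand.unsqueeze(0)).squeeze()
        loss.backward()
        opt.step()
    with torch.no_grad():
        return model(x_cand.unsqueeze(0)).item()

def oae_select_batch(X_obs, y_obs, X_pool, batch_size=3):
    # Score every remaining arm by its optimistic value and query the
    # top batch_size arms; queried arms then leave the pool.
    scores = [optimistic_value(X_obs, y_obs, x) for x in X_pool]
    return np.argsort(scores)[-batch_size:]
```

Refitting one penalized network per candidate is the naive reading of the principle; the paper's Algorithms 2 and 3 (OAE-DvD, OAE-Seq) additionally diversify the members of each batch, which this sketch omits.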
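
Here, too, is a skeleton of a custom acquisition function in the style of the genedisco-starter repository linked above. The import paths and the __call__ signature follow the starter template from memory and may have drifted from the current repository, so treat them as assumptions; the greedy-by-prediction scoring is a placeholder standing in for the paper's Bayesian OAE, not its implementation.

```python
from typing import AnyStr, List

import numpy as np
from slingpy import AbstractBaseModel, AbstractDataSource
from genedisco.active_learning_methods.acquisition_functions.base_acquisition_function import (
    BaseBatchAcquisitionFunction,
)

class GreedyPredictionAcquisition(BaseBatchAcquisitionFunction):
    """Ranks the still-available interventions by the model's predicted
    effect and returns the top batch_size, in contrast to the purely
    uncertainty-driven baselines shipped with the benchmark."""

    def __call__(self,
                 dataset_x: AbstractDataSource,
                 batch_size: int,
                 available_indices: List[AnyStr],
                 last_selected_indices: List[AnyStr],
                 model: AbstractBaseModel) -> List:
        avail = dataset_x.subset(available_indices)
        # slingpy models return a list of output arrays; take the first.
        preds = np.asarray(model.predict(avail)[0]).ravel()
        top = np.argsort(preds)[-batch_size:]
        return [available_indices[int(i)] for i in top]
```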
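
Finally, a minimal training-loop sketch matching the quoted experiment setup: ReLU activations, Adam, 5000 steps with minibatches of size 10, wrapped in the outer loop of N = 150 query batches of size B = 3. Network width and depth, the learning rate, and the query_reward interface are assumptions the paper does not pin down, and the greedy pick marked below is only a stand-in for OAE scoring.

```python
import torch
import torch.nn as nn

STEPS, MINIBATCH = 5000, 10   # quoted from the paper's setup
B, N_BATCHES = 3, 150         # quoted batch size and number of batches
LR = 1e-3                     # learning rate is not reported; assumption

def train_reward_model(X, y):
    model = nn.Sequential(nn.Linear(X.shape[1], 64), nn.ReLU(),
                          nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 1))
    opt = torch.optim.Adam(model.parameters(), lr=LR)
    for _ in range(STEPS):
        idx = torch.randint(0, X.shape[0], (min(MINIBATCH, X.shape[0]),))
        loss = nn.functional.mse_loss(model(X[idx]).squeeze(-1), y[idx])
        opt.zero_grad()
        loss.backward()
        opt.step()
    return model

def run_experiment(X_pool, query_reward):
    # query_reward(x) is a hypothetical hook for observing an arm's
    # reward. The first batch is random (no data yet); later batches
    # pick greedily by prediction, where OAE scoring would slot in.
    X_obs, y_obs = torch.empty(0, X_pool.shape[1]), torch.empty(0)
    for t in range(N_BATCHES):
        if t == 0:
            pick = torch.randperm(X_pool.shape[0])[:B]
        else:
            model = train_reward_model(X_obs, y_obs)
            with torch.no_grad():
                pick = model(X_pool).squeeze(-1).topk(B).indices
        X_new = X_pool[pick]
        y_new = torch.tensor([query_reward(x) for x in X_new])
        X_obs = torch.cat([X_obs, X_new])
        y_obs = torch.cat([y_obs, y_new])
    return X_obs, y_obs
```

Per the quoted setup, each such run would be repeated 25 times, with averages and standard error bars reported at each time step.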