MAGAN: Aligning Biological Manifolds

Authors: Matthew Amodio, Smita Krishnaswamy

ICML 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate applications of MAGAN in single-cell biology in integrating two different measurement types together: cells from the same tissue are measured with both genomic (single-cell RNA-sequencing) and proteomic (mass cytometry) technologies. We show that MAGAN successfully aligns manifolds such that known correlations between measured markers are improved compared to other recently proposed models. The rest of this paper is organized as follows. First, there is a detailed description of the MAGAN architecture. Next, there is a validation of its performance on artificial data and the standard MNIST dataset. Then, there are demonstrations on three real-world biological applications: mapping between two replicate cytometry domains, mapping between two different cytometry domains, and mapping between one cytometry domain and a single-cell RNA sequencing domain.
Researcher Affiliation | Academia | (1) Department of Computer Science, Yale University; (2) Department of Genetics, Yale University.
Pseudocode | No | The paper describes the architecture and loss functions mathematically, but it does not include any pseudocode or clearly labeled algorithm blocks.
Open Source Code | No | The paper does not provide any statement about making its source code available, nor does it provide a link to a code repository.
Open Datasets | Yes | Next we test a subset of the MNIST handwritten digit data by taking only 3s and 7s as the first domain X1, and a 120 degree rotation of each image as the second domain X2. Next we demonstrate MAGAN's ability to align two manifolds in domains whose dimensionality only partly overlap. To test this, we use the datasets from two experiments published in (Setty et al., 2016) where each experiment had a different panel that was run on samples from the same population of cells. To test MAGAN in this setting, we use a dataset consisting of 2830 measurements, where the dimensionality of each domain is 12 and 12496 for cytometry and scRNA-seq, respectively (Velten et al., 2017).
Dataset Splits | No | The paper mentions that "Optimization was performed on 100,000 iterations of batches of size 256" and discusses cross-validation: "We perform cross-validation by repeating this test with each of the 16 shared markers in turn...". However, it does not specify explicit training/validation/test dataset splits with percentages, sample counts, or clear references to predefined splits for reproducibility beyond these statements.
Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments (e.g., GPU models, CPU types, or memory specifications).
Software Dependencies | No | The paper mentions general software components like "Leaky ReLU activations", "sigmoid", "linear" for output layers, "Dropout of 0.9", and "ADAM optimizer", but it does not specify version numbers for any programming languages, libraries, or frameworks (e.g., Python version, TensorFlow/PyTorch version).
Experiment Setup | Yes | All experiments were performed with the MAGAN framework with discriminators of five layers each and generators of three layers each. Layer sizes depended on the dataset, while Leaky ReLU activations were used on all layers except the output layers of the discriminators (which were sigmoid) and the generators (which were linear). Dropout of 0.9 was applied during training and for images convolutional layers were used. Optimization was performed on 100,000 iterations of batches of size 256 by the ADAM optimizer with learning rate 0.001.
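
The experiment setup quoted above can be written down as a small configuration sketch. This is not the authors' code (the paper releases none); it only records the stated hyperparameters and lays out layer shapes for a generator and a discriminator. The names `MaganSetup`, `layer_plan`, `generator_plan`, `discriminator_plan`, and the hidden width of 128 are hypothetical, since the paper says only that layer sizes depended on the dataset. The source also does not say whether "Dropout of 0.9" is a keep probability or a drop rate, so the value is stored as given.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class MaganSetup:
    """Hyperparameters as stated in the experiment setup row (names are hypothetical)."""
    generator_layers: int = 3
    discriminator_layers: int = 5
    hidden_activation: str = "leaky_relu"
    generator_output: str = "linear"
    discriminator_output: str = "sigmoid"
    dropout: float = 0.9  # keep probability vs. drop rate is not specified in the source
    iterations: int = 100_000
    batch_size: int = 256
    optimizer: str = "adam"
    learning_rate: float = 0.001


# One dense layer described as (input width, output width, activation name).
Layer = Tuple[int, int, str]


def layer_plan(in_dim: int, out_dim: int, hidden: List[int],
               hidden_act: str, out_act: str) -> List[Layer]:
    """List the layers of a fully connected network, last layer using out_act."""
    widths = [in_dim, *hidden, out_dim]
    last = len(widths) - 2
    return [
        (widths[i], widths[i + 1], out_act if i == last else hidden_act)
        for i in range(len(widths) - 1)
    ]


def generator_plan(setup: MaganSetup, in_dim: int, out_dim: int,
                   hidden_width: int = 128) -> List[Layer]:
    """Three-layer generator mapping one domain to the other (hidden width assumed)."""
    hidden = [hidden_width] * (setup.generator_layers - 1)
    return layer_plan(in_dim, out_dim, hidden,
                      setup.hidden_activation, setup.generator_output)


def discriminator_plan(setup: MaganSetup, in_dim: int,
                       hidden_width: int = 128) -> List[Layer]:
    """Five-layer discriminator ending in a single sigmoid unit (hidden width assumed)."""
    hidden = [hidden_width] * (setup.discriminator_layers - 1)
    return layer_plan(in_dim, 1, hidden,
                      setup.hidden_activation, setup.discriminator_output)


setup = MaganSetup()
# Domain sizes taken from the Open Datasets row: 12 (cytometry), 12496 (scRNA-seq).
gen_cyto_to_rna = generator_plan(setup, in_dim=12, out_dim=12496)
disc_rna = discriminator_plan(setup, in_dim=12496)
```

Here `gen_cyto_to_rna` describes three layers ending in a linear output of width 12496, and `disc_rna` describes five layers ending in one sigmoid unit. For the image experiments the paper reports convolutional layers, which this dense-only sketch does not cover.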