Learning transport cost from subset correspondence

Authors: Ruishan Liu, Akshay Balsubramani, James Zou

ICLR 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | On systematic experiments in images, marriage-matching and single-cell RNA-seq, our method substantially outperforms state-of-the-art benchmarks.
Researcher Affiliation | Academia | Ruishan Liu, Department of Electrical Engineering, Stanford University; Akshay Balsubramani, Department of Genetics, Stanford University; James Zou, Department of Biomedical Data Science, Stanford University
Pseudocode | Yes | Algorithm 1 (OT-SI)
Open Source Code | No | The paper does not explicitly state that the source code for its method is publicly available, nor does it provide a link to a code repository.
Open Datasets | Yes | The paper demonstrates OT-SI on learning a cost function that aligns two data types, RNA and protein expression, in the CITE-seq cord blood mononuclear cells (CBMCs) experiments (Stoeckius et al., 2017), as well as on the MNIST dataset and the Dutch Household Survey (DHS) dataset.
Dataset Splits | Yes | A validation set is used for hyperparameter selection and early stopping. In the two-moons experiment, the training, test and validation datasets are generated with 100, 100, and 50 samples of each moon; in another experiment, the dataset is split into training, validation and test sets with ratio 50%, 20% and 30%. (A minimal split helper appears after the table.)
Hardware Specification | No | The OT-SI algorithm is implemented in PyTorch (Paszke et al., 2017) and trained with a GPU; no specific GPU model, CPU, or other hardware specification is mentioned.
Software Dependencies | No | PyTorch (Paszke et al., 2017) is mentioned as a dependency, but no specific version number is provided.
Experiment Setup | Yes | The parameter λ = 10^3 and the number of Sinkhorn-Knopp iterations N = 200; the algorithm is run for 100 epochs with step size 1. When OT-SI is learned in the original data space, with the expression of 5001 mRNAs and 13 proteins for each cell, the cost function is parametrized by a fully connected neural network with two hidden layers of 100 and 5 neurons. (A hedged PyTorch sketch of this setup follows the table.)
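
A minimal PyTorch sketch of the setup quoted in the Experiment Setup row, under stated assumptions: the cost is parametrized by a fully connected network with hidden layers of 100 and 5 units, the transport plan comes from N = 200 Sinkhorn-Knopp iterations at λ = 10^3 (written in the log domain for numerical stability), and training uses SGD with step size 1 for 100 epochs. The names CostNet, sinkhorn and subset_score are illustrative, and the subset-alignment score is a simplified stand-in for the objective in Algorithm 1, not the authors' released implementation.

```python
import math
import torch
import torch.nn as nn

class CostNet(nn.Module):
    """Learned pairwise cost c_theta(x, y) on concatenated feature pairs."""
    def __init__(self, dim_x, dim_y):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim_x + dim_y, 100), nn.ReLU(),   # hidden layer of 100 neurons
            nn.Linear(100, 5), nn.ReLU(),               # hidden layer of 5 neurons
            nn.Linear(5, 1),
        )

    def forward(self, x, y):
        # Build all (x_i, y_j) concatenations and return an (n, m) cost matrix.
        n, m = x.shape[0], y.shape[0]
        pairs = torch.cat(
            [x.unsqueeze(1).expand(n, m, -1), y.unsqueeze(0).expand(n, m, -1)], dim=-1
        )
        return self.net(pairs).squeeze(-1)

def sinkhorn(C, lam=1e3, n_iter=200):
    """Entropy-regularized transport plan via log-domain Sinkhorn-Knopp (uniform marginals)."""
    n, m = C.shape
    log_a = torch.full((n,), -math.log(n))
    log_b = torch.full((m,), -math.log(m))
    M = -lam * C                                        # log of the Gibbs kernel exp(-lam * C)
    f, g = torch.zeros(n), torch.zeros(m)
    for _ in range(n_iter):
        f = log_a - torch.logsumexp(M + g.unsqueeze(0), dim=1)
        g = log_b - torch.logsumexp(M + f.unsqueeze(1), dim=0)
    return torch.exp(M + f.unsqueeze(1) + g.unsqueeze(0))   # transport plan P

def subset_score(P, rows, cols):
    """Mass transported between one known pair of corresponding subsets (simplified)."""
    return P[rows][:, cols].sum()

# Toy usage on synthetic data with a single annotated subset pair (all sizes hypothetical).
x, y = torch.randn(100, 13), torch.randn(100, 13)
rows = cols = torch.arange(20)
cost_net = CostNet(13, 13)
opt = torch.optim.SGD(cost_net.parameters(), lr=1.0)    # "step size 1"
for epoch in range(100):                                # "100 epochs"
    P = sinkhorn(cost_net(x, y), lam=1e3, n_iter=200)
    loss = -subset_score(P, rows, cols)                 # push mass onto the known subsets
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Because the Sinkhorn iterations are differentiable, gradients of the alignment score flow back into the cost network's parameters, which is the mechanism the quoted setup relies on.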
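
For the Dataset Splits row, a trivial helper (illustrative only; the paper does not provide split code) that produces the quoted 50% / 20% / 30% train/validation/test partition:

```python
import torch

def split_indices(n, seed=0):
    """Random 50% / 20% / 30% train / validation / test index split (hypothetical helper)."""
    g = torch.Generator().manual_seed(seed)
    idx = torch.randperm(n, generator=g)
    n_train, n_val = int(0.5 * n), int(0.2 * n)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]
```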