Learning transport cost from subset correspondence

Authors: Ruishan Liu, Akshay Balsubramani, James Zou

ICLR 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | On systematic experiments in images, marriage-matching and single-cell RNA-seq, our method substantially outperforms state-of-the-art benchmarks.
Researcher Affiliation | Academia | Ruishan Liu, Department of Electrical Engineering, Stanford University; Akshay Balsubramani, Department of Genetics, Stanford University; James Zou, Department of Biomedical Data Science, Stanford University
Pseudocode | Yes | Algorithm 1 (OT-SI)
Open Source Code | No | The paper does not explicitly state that the source code for its method is publicly available, nor does it provide a link to a code repository.
Open Datasets | Yes | The paper demonstrates OT-SI on learning a cost function that aligns two data types, RNA and protein expression, in the CITE-seq cord blood mononuclear cells (CBMCs) experiments (Stoeckius et al., 2017), as well as on the MNIST dataset and the Dutch Household Survey (DHS) dataset.
Dataset Splits | Yes | A validation set is used for hyperparameter selection and early stopping. In the two-moons experiment, the training, test and validation datasets are generated with 100, 100, and 50 samples of each moon; in another experiment, the dataset is split into training, validation and test sets with ratio 50%, 20% and 30%. (A minimal split helper appears after the table.)
Hardware Specification | No | The OT-SI algorithm is implemented in PyTorch (Paszke et al., 2017) and trained with a GPU; no specific GPU model, CPU, or other hardware specification is mentioned.
Software Dependencies | No | PyTorch (Paszke et al., 2017) is mentioned as a dependency, but no specific version number is provided.
Experiment Setup | Yes | The parameter λ = 10^3 and the number of Sinkhorn-Knopp iterations N = 200; the algorithm is run for 100 epochs with step size 1. When OT-SI is learned in the original data space, with the expression of 5001 mRNAs and 13 proteins for each cell, the cost function is parametrized by a fully connected neural network with two hidden layers of 100 and 5 neurons. (A hedged PyTorch sketch of this setup follows the table.)
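
A minimal PyTorch sketch of the setup quoted in the Experiment Setup row, under stated assumptions: the cost is parametrized by a fully connected network with hidden layers of 100 and 5 units, the transport plan comes from N = 200 Sinkhorn-Knopp iterations at λ = 10^3 (written in the log domain for numerical stability), and training uses SGD with step size 1 for 100 epochs. The names CostNet, sinkhorn and subset_score are illustrative, and the subset-alignment score is a simplified stand-in for the objective in Algorithm 1, not the authors' released implementation.

```python
import math
import torch
import torch.nn as nn

class CostNet(nn.Module):
    """Learned pairwise cost c_theta(x, y) on concatenated feature pairs."""
    def __init__(self, dim_x, dim_y):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim_x + dim_y, 100), nn.ReLU(),   # hidden layer of 100 neurons
            nn.Linear(100, 5), nn.ReLU(),               # hidden layer of 5 neurons
            nn.Linear(5, 1),
        )

    def forward(self, x, y):
        # Build all (x_i, y_j) concatenations and return an (n, m) cost matrix.
        n, m = x.shape[0], y.shape[0]
        pairs = torch.cat(
            [x.unsqueeze(1).expand(n, m, -1), y.unsqueeze(0).expand(n, m, -1)], dim=-1
        )
        return self.net(pairs).squeeze(-1)

def sinkhorn(C, lam=1e3, n_iter=200):
    """Entropy-regularized transport plan via log-domain Sinkhorn-Knopp (uniform marginals)."""
    n, m = C.shape
    log_a = torch.full((n,), -math.log(n))
    log_b = torch.full((m,), -math.log(m))
    M = -lam * C                                        # log of the Gibbs kernel exp(-lam * C)
    f, g = torch.zeros(n), torch.zeros(m)
    for _ in range(n_iter):
        f = log_a - torch.logsumexp(M + g.unsqueeze(0), dim=1)
        g = log_b - torch.logsumexp(M + f.unsqueeze(1), dim=0)
    return torch.exp(M + f.unsqueeze(1) + g.unsqueeze(0))   # transport plan P

def subset_score(P, rows, cols):
    """Mass transported between one known pair of corresponding subsets (simplified)."""
    return P[rows][:, cols].sum()

# Toy usage on synthetic data with a single annotated subset pair (all sizes hypothetical).
x, y = torch.randn(100, 13), torch.randn(100, 13)
rows = cols = torch.arange(20)
cost_net = CostNet(13, 13)
opt = torch.optim.SGD(cost_net.parameters(), lr=1.0)    # "step size 1"
for epoch in range(100):                                # "100 epochs"
    P = sinkhorn(cost_net(x, y), lam=1e3, n_iter=200)
    loss = -subset_score(P, rows, cols)                 # push mass onto the known subsets
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Because the Sinkhorn iterations are differentiable, gradients of the alignment score flow back into the cost network's parameters, which is the mechanism the quoted setup relies on.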
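
For the Dataset Splits row, a trivial helper (illustrative only; the paper does not provide split code) that produces the quoted 50% / 20% / 30% train/validation/test partition:

```python
import torch

def split_indices(n, seed=0):
    """Random 50% / 20% / 30% train / validation / test index split (hypothetical helper)."""
    g = torch.Generator().manual_seed(seed)
    idx = torch.randperm(n, generator=g)
    n_train, n_val = int(0.5 * n), int(0.2 * n)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]
```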