Supervised Quantile Normalization for Low Rank Matrix Factorization

Authors: Marco Cuturi, Olivier Teboul, Jonathan Niles-Weed, Jean-Philippe Vert

ICML 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We demonstrate the applicability of these techniques on synthetic and genomics datasets."
Researcher Affiliation | Collaboration | Google Research, Brain Team; New York University.
Pseudocode | Yes | Algorithm 1 (Sinkhorn with ℓ iterations). Inputs: a, b, x, y, ε, c. Initialize C_xy ← [c(x_i, y_j)]_ij, K ← e^(−C_xy/ε), u_0 ← 1_n. For i = 1, ..., ℓ: v_i ← b / (K^T u_{i−1}), u_i ← a / (K v_i). Result: u_ℓ, v_ℓ, u_{ℓ−1}, K. (A runnable sketch follows the table.)
Open Source Code | No | The paper does not provide concrete access to source code for the methodology described.
Open Datasets | Yes | "As an illustration on real-world data we consider the problem of multi-omics data integration, a domain where NMF has been shown to be a relevant approach to capture low-rank representations of cancer patients using multiple omics datasets (Chalise & Fridley, 2017). Following the recent benchmark of Cantini et al. (2020), we collected from The Cancer Genome Atlas (TCGA) three types of genomic data (gene expression, miRNA expression and methylation) for thousands of cancer samples from 9 cancer types."
Dataset Splits | No | The paper does not provide specific dataset split information (exact percentages, sample counts, or citations to predefined splits) needed to reproduce the data partitioning.
Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types, or memory amounts) used for running its experiments.
Software Dependencies | No | The paper mentions "TensorFlow" but does not specify its version or other software dependencies with version numbers.
Experiment Setup | Yes | "In all experiments reported here, we set ε and learning rates to 0.01. [...] For QMF, we set the number of target quantiles to m = 16, and the regularization factor to 0.001. We train each model for 1,000 epochs, with a batch size of 64 and learning rate of 0.001." (The quoted hyperparameters are consolidated in a configuration sketch below.)
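
To complement the Pseudocode row, here is a minimal NumPy sketch of Algorithm 1 as quoted. Only the update rules and the returned quantities (u_ℓ, v_ℓ, u_{ℓ−1}, K) come from the paper's pseudocode; the function name `sinkhorn`, the `n_iters` argument, and the example cost function are assumptions made for illustration.

```python
import numpy as np

def sinkhorn(a, b, x, y, eps, cost, n_iters):
    """Run n_iters >= 1 Sinkhorn iterations (Algorithm 1 as quoted above).

    a, b: weight vectors of the two discrete measures (each sums to 1).
    x, y: support points of the two measures.
    eps: entropic regularization strength.
    cost: ground cost c(x_i, y_j), applied pairwise.
    Returns u_l, v_l, u_{l-1}, and the kernel K, matching the
    algorithm's stated outputs.
    """
    # Pairwise cost matrix C_xy = [c(x_i, y_j)]_ij and Gibbs kernel K = exp(-C/eps).
    C = np.array([[cost(xi, yj) for yj in y] for xi in x])
    K = np.exp(-C / eps)
    u = np.ones_like(a)  # u_0 = 1_n
    for _ in range(n_iters):
        u_prev = u
        v = b / (K.T @ u_prev)  # v_i <- b / K^T u_{i-1}
        u = a / (K @ v)         # u_i <- a / K v_i
    return u, v, u_prev, K

# Example: two 1-D point clouds with a squared-distance cost (illustrative choices).
x = np.linspace(0.0, 1.0, 8)
y = np.linspace(0.2, 1.2, 8)
a = np.full(8, 1 / 8)
b = np.full(8, 1 / 8)
u, v, u_prev, K = sinkhorn(a, b, x, y, eps=0.01,
                           cost=lambda s, t: (s - t) ** 2, n_iters=100)
plan = u[:, None] * K * v[None, :]  # transport plan P = diag(u) K diag(v)
```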
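The Experiment Setup row quotes hyperparameters scattered across the paper; a hypothetical consolidation into a single Python configuration follows. The key names are illustrative, not identifiers from the authors' (unreleased) code; the values are exactly those quoted above.

```python
# Hyperparameters quoted in the Experiment Setup row. Key names are
# illustrative assumptions, not taken from the authors' code.
experiment_config = {
    "sinkhorn_epsilon": 0.01,       # entropic regularization ε
    "default_learning_rate": 0.01,  # "we set ε and learning rates to 0.01"
    "qmf_num_quantiles": 16,        # m = 16 target quantiles for QMF
    "qmf_regularization": 0.001,    # QMF regularization factor
    "num_epochs": 1000,
    "batch_size": 64,
    "training_learning_rate": 0.001,
}
```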