Supervised Quantile Normalization for Low Rank Matrix Factorization
Authors: Marco Cuturi, Olivier Teboul, Jonathan Niles-Weed, Jean-Philippe Vert
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the applicability of these techniques on synthetic and genomics datasets. |
| Researcher Affiliation | Collaboration | ¹Google Research, Brain Team; ²New York University. |
| Pseudocode | Yes | Algorithm 1 (Sinkhorn with ℓ iterations). Inputs: a, b, x, y, ε, c. Initialize C_xy ← [c(x_i, y_j)]_ij, K ← e^{−C_xy/ε}, u_0 ← 1_n. For i = 1, …, ℓ: v_i ← b ⊘ (Kᵀ u_{i−1}), u_i ← a ⊘ (K v_i). Result: u_ℓ, v_ℓ, u_{ℓ−1}, K. (See the NumPy sketch after the table.) |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described. |
| Open Datasets | Yes | As an illustration on real-world data we consider the problem of multiomics data integration, a domain where NMF has been shown to be a relevant approach to capture low-rank representations of cancer patients using multiple omics datasets (Chalise & Fridley, 2017). Following the recent benchmark of Cantini et al. (2020), we collected from The Cancer Genome Atlas (TCGA) three types of genomic data (gene expression, miRNA expression and methylation) for thousands of cancer samples from 9 cancer types |
| Dataset Splits | No | The paper does not provide specific dataset split information (exact percentages, sample counts, or citations to predefined splits) needed to reproduce the data partitioning. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types, or memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper mentions "TensorFlow" but does not specify its version or other software dependencies with version numbers. |
| Experiment Setup | Yes | In all experiments reported here, we set ε and learning rates to 0.01. [...] For QMF, we set the number of target quantiles to m = 16, and the regularization factor to 0.001. We train each model for 1,000 epochs, with a batch size of 64 and learning rate of 0.001. |
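
The pseudocode row above quotes the paper's Algorithm 1. The following is a minimal NumPy sketch of those Sinkhorn iterations, included only to make the reconstructed pseudocode concrete; it is not the authors' TensorFlow implementation, and the cost function, data, and iteration count are illustrative assumptions.

```python
import numpy as np

def sinkhorn(a, b, x, y, eps, cost, n_iters):
    """Plain Sinkhorn loop following the structure of Algorithm 1 (n_iters >= 1)."""
    # Cost matrix C_xy = [c(x_i, y_j)]_ij and Gibbs kernel K = exp(-C_xy / eps).
    C = cost(x[:, None], y[None, :])
    K = np.exp(-C / eps)
    u = np.ones_like(a)                 # u_0 = 1_n
    u_prev = u
    for _ in range(n_iters):
        v = b / (K.T @ u)               # v_i = b / (K^T u_{i-1})
        u_prev, u = u, a / (K @ v)      # u_i = a / (K v_i)
    return u, v, u_prev, K

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, m = 8, 4
    x = np.sort(rng.uniform(size=n))    # data values rescaled to [0, 1] (assumption)
    y = np.linspace(0.0, 1.0, m)        # target quantile locations (assumption)
    a = np.full(n, 1.0 / n)             # uniform weights on the data points
    b = np.full(m, 1.0 / m)             # uniform weights on the targets
    u, v, u_prev, K = sinkhorn(
        a, b, x, y, eps=0.1,            # larger eps than the paper's 0.01, to avoid underflow
        cost=lambda xi, yj: (xi - yj) ** 2,
        n_iters=100,
    )
    # diag(u) K diag(v) approximates the transport plan; its rows sum to ~a.
    P = u[:, None] * K * v[None, :]
    print(P.sum(axis=1))
```

In practice, implementations of this loop are usually stabilized in the log domain when ε is small (e.g. the 0.01 quoted above); the plain version here is kept deliberately simple.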
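
For convenience, the hyperparameters quoted in the Experiment Setup row are collected below as a plain Python dictionary. The key names are hypothetical and do not come from the paper or any released code; only the values are taken from the quoted text.

```python
# Hyperparameters quoted in the Experiment Setup row; key names are hypothetical.
qmf_setup = {
    "epsilon": 0.01,              # entropic regularization ("we set ε ... to 0.01")
    "num_target_quantiles": 16,   # m = 16 for QMF
    "regularization_factor": 0.001,
    "epochs": 1000,
    "batch_size": 64,
    "learning_rate": 0.001,       # per-model training; 0.01 is also quoted for "all experiments"
}
```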