Adaptive Sketching for Fast and Convergent Canonical Polyadic Decomposition

Authors: Alex Gittens, Kareem Aggour, Bülent Yener

ICML 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | On both synthetic and real data we observe that, for noisy tensors, CPD-MWU produces decompositions of comparable accuracy to the standard CPD decomposition in less time, often half the time; for ill-conditioned tensors, given the same time budget, CPD-MWU produces decompositions with an order-of-magnitude lower relative error. For a representative real-world dataset, CPD-MWU produces residual errors on average 20% lower than CPRAND-MIX and 44% lower than SPALS, two recent sketched CPD algorithms.
Researcher Affiliation | Collaboration | GE Global Research, Niskayuna, New York, USA; Rensselaer Polytechnic Institute, Troy, New York, USA.
Pseudocode | Yes | Algorithm 1: CPD-MULTIPLICATIVE WEIGHTS UPDATE
Open Source Code | Yes | CPD-MWU is available at https://github.com/kaggour/CPD-MWU.
Open Datasets | Yes | The Carnegie Mellon Never Ending Language Learning (NELL) repository (Carlson et al., 2010) has previously been analyzed using CPD (Kang et al., 2012).
Dataset Splits | No | The paper states: "Each synthetic tensor is rank-5 and in R^{366 x 366 x 100,000}, split evenly in the 3rd mode into 20K slices for distributed processing in Spark." This describes data partitioning for distributed processing, not training/validation/test splits with explicit percentages or strategies for model validation.
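The synthetic setup quoted above (a rank-5 CP tensor, partitioned along the 3rd mode) can be sketched as follows. This is a hypothetical illustration, not the authors' code: the shape is scaled down from 366 x 366 x 100,000 to fit in memory, and the number of slices is reduced from 20K to 10.

```python
import numpy as np

# Illustrative stand-in for the paper's synthetic tensors: a rank-5 CP
# tensor, here at a reduced size of 36 x 36 x 1000 (the paper uses
# 366 x 366 x 100,000).
rng = np.random.default_rng(0)
rank, shape = 5, (36, 36, 1000)

# CP factor matrices A, B, C, each with `rank` columns.
A, B, C = (rng.standard_normal((n, rank)) for n in shape)

# T[i, j, k] = sum_r A[i, r] * B[j, r] * C[k, r]
T = np.einsum('ir,jr,kr->ijk', A, B, C)

# Split evenly in the 3rd mode for distributed processing (the paper
# uses 20K Spark slices; we use 10 here for illustration).
slices = np.array_split(T, 10, axis=2)
```

Each slice keeps the full first two modes, so a mode-3 partition like this lets a distributed ALS solver update the first two factor matrices from local contributions.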
Hardware Specification | No | The paper mentions running experiments in a "distributed setting" using "Apache Spark" on "commodity hardware" but does not provide specific details about the CPUs, GPUs, or memory used (e.g., model numbers or configurations).
Software Dependencies | No | The paper mentions implementations in "Python" and the use of "Apache Spark (Zaharia et al., 2010)" but does not specify version numbers for either.
Experiment Setup | Yes | Five sketching rates were used for the CPD-MWU experiments, with four rates linearly spaced in the interval [10^-6, 10^-4]. The fifth sketching rate was set to 1 so that CPD-MWU could use the full tensor if it determined that to be advantageous. We observed that the performance is robust to the choice of η and used the value 2 for our experiments; we set ε to 0.15, which is smaller than 1/N, to amortize the costs of the solves as described in the discussion following Listing 1. For decomposing ill-conditioned tensors, we use proximal regularization (λ = 0.001) for CPD-MWU and Sketched CPD.
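The hyperparameter setup quoted above, five sketching rates (four linearly spaced in [10^-6, 10^-4] plus a full-tensor rate of 1) scored by a multiplicative weights update with η = 2, can be sketched as follows. This is a hedged illustration, not the paper's implementation: the per-arm losses here are placeholder numbers, whereas CPD-MWU scores each rate by the progress of the decomposition.

```python
import numpy as np

# Arms of the multiplicative-weights scheme: four sketching rates linearly
# spaced in [1e-6, 1e-4], plus rate 1 (use the full tensor).
rates = np.concatenate([np.linspace(1e-6, 1e-4, 4), [1.0]])
eta = 2.0                          # learning rate, as reported above
weights = np.ones_like(rates)      # uniform initial weights

def mwu_step(weights, losses, eta=eta):
    """One multiplicative-weights update: exponentially downweight
    high-loss arms, then renormalize to a probability distribution."""
    w = weights * np.exp(-eta * np.asarray(losses))
    return w / w.sum()

# Example round with placeholder losses: arm 2 incurred the lowest loss,
# so its selection probability grows the most.
losses = [0.9, 0.7, 0.2, 0.8, 0.6]
probs = mwu_step(weights, losses)
chosen = rates[np.argmax(probs)]   # greedy pick, for illustration
```

In practice the update is applied after each sketched least-squares solve, so that over iterations the algorithm concentrates probability mass on the sketching rate that best trades off per-iteration cost against progress.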