Fast and Guaranteed Tensor Decomposition via Sketching
Authors: Yining Wang, Hsiao-Yu Tung, Alexander J. Smola, Anima Anandkumar
NeurIPS 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on synthetic and real-world datasets show highly competitive results. We demonstrate a 10x to 100x speed-up over exact methods for decomposing dense, high-dimensional tensors. For topic modeling, we show a significant reduction in computational time over existing spectral LDA implementations with small performance loss. In addition, our proposed algorithm outperforms collapsed Gibbs sampling when running time is constrained. We also show that if a Gibbs sampler is initialized with our output topics, it converges within several iterations and outperforms a randomly initialized Gibbs sampler run for many more iterations. |
| Researcher Affiliation | Academia | Yining Wang, Hsiao-Yu Tung, Alex Smola (Machine Learning Department, Carnegie Mellon University, Pittsburgh, PA 15213; {yiningwa,htung}@cs.cmu.edu, alex@smola.org); Anima Anandkumar (Department of EECS, University of California, Irvine, Irvine, CA 92697; a.anandkumar@uci.edu) |
| Pseudocode | Yes | Algorithm 1: Fast robust tensor power method (a minimal sketch of the underlying power iteration appears after the table). |
| Open Source Code | No | The paper does not provide any explicit statements about open-sourcing the code for the methodology described, nor does it include links to a code repository. |
| Open Datasets | Yes | We compare our proposed fast spectral LDA algorithm with baseline spectral methods and collapsed Gibbs sampling (using the GibbsLDA++ [25] implementation) on two real-world datasets: Wikipedia and Enron. |
| Dataset Splits | No | Obtained topic models Φ ∈ ℝ^{V×K} are evaluated on a held-out dataset consisting of 1000 documents randomly picked out from training datasets. (This describes a held-out test set, but not a separate validation set or specific train/validation/test splits for reproducibility.) |
| Hardware Specification | Yes | All methods are implemented in C++ and tested on a single machine with 8 Intel X5550@2.67GHz CPUs and 32GB memory. |
| Software Dependencies | No | All methods are implemented in C++ and tested on a single machine with 8 Intel X5550@2.67GHz CPUs and 32GB memory. We compare our proposed fast spectral LDA algorithm with baseline spectral methods and collapsed Gibbs sampling (using the GibbsLDA++ [25] implementation)... (No compiler, library, or GibbsLDA++ version numbers are given.) |
| Experiment Setup | Yes | For the robust tensor power method the parameters are set to L = 50 and T = 30. For ALS we iterate until convergence, or until a maximum of 1000 iterations is reached. α₀ is set to 1.0 and B is set to 30. For collapsed Gibbs sampling, we set α = 50/K and β = 0.1, following [11]. (These settings are collected into a configuration sketch after the table.) |
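For context on the pseudocode row, below is a minimal sketch of the plain robust tensor power method (random restarts, power iterations, deflation) that the paper's Algorithm 1 accelerates. The function name and the dense `numpy` contractions are illustrative assumptions, not the paper's implementation, which replaces these contractions with FFT-based tensor sketches.

```python
import numpy as np

def robust_tensor_power_method(T, k, L=50, n_iters=30):
    """Extract k (eigenvalue, eigenvector) pairs from a symmetric
    3rd-order tensor T of shape (d, d, d) via power iteration with
    random restarts and deflation. Defaults L=50 and n_iters=30 match
    the L = 50, T = 30 settings quoted in the experiment-setup row."""
    T = np.array(T, dtype=float)  # work on a float copy; T is deflated in place
    d = T.shape[0]
    eigvals, eigvecs = [], []
    for _ in range(k):
        best_u, best_val = None, -np.inf
        for _ in range(L):                              # L random restarts
            u = np.random.randn(d)
            u /= np.linalg.norm(u)
            for _ in range(n_iters):                    # power iterations
                u = np.einsum('ijk,j,k->i', T, u, u)    # u <- T(I, u, u)
                u /= np.linalg.norm(u)
            val = np.einsum('ijk,i,j,k->', T, u, u, u)  # objective T(u, u, u)
            if val > best_val:
                best_u, best_val = u, val
        eigvals.append(best_val)
        eigvecs.append(best_u)
        # Deflate: subtract the recovered rank-one component.
        T -= best_val * np.einsum('i,j,k->ijk', best_u, best_u, best_u)
    return np.array(eigvals), np.array(eigvecs)
```

Each exact contraction T(I, u, u) above costs O(d³); the paper's sketching replaces these contractions with cheaper sketch-space operations, which is where the reported 10x to 100x speed-ups on dense, high-dimensional tensors come from.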
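For convenience, the hyperparameters quoted in the experiment-setup row can be collected into a single configuration. The key names below are hypothetical; the reading of B as the number of independent sketches follows the paper's notation and is an assumption.

```python
# Hypothetical key names; the values are the ones quoted in the table.
EXPERIMENT_SETUP = {
    "tensor_power_method": {"L": 50, "T": 30},      # restarts, iterations per restart
    "als": {"max_iters": 1000},                     # or stop earlier on convergence
    "fast_spectral_lda": {"alpha0": 1.0, "B": 30},  # B: number of sketches (assumed reading)
    "collapsed_gibbs": {"beta": 0.1},               # alpha depends on K, see below
}

def gibbs_alpha(K):
    """Dirichlet prior for collapsed Gibbs sampling: alpha = 50/K, per [11]."""
    return 50.0 / K
```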