Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Fast and Guaranteed Tensor Decomposition via Sketching

Authors: Yining Wang, Hsiao-Yu Tung, Alexander J. Smola, Anima Anandkumar

NeurIPS 2015 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on synthetic and real-world datasets show highly competitive results. We demonstrate a 10x to 100x speed-up over exact methods for decomposing dense, high-dimensional tensors. For topic modeling, we show a significant reduction in computational time over existing spectral LDA implementations with small performance loss. In addition, our proposed algorithm outperforms collapsed Gibbs sampling when running time is constrained. We also show that if a Gibbs sampler is initialized with our output topics, it converges within several iterations and outperforms a randomly initialized Gibbs sampler run for many more iterations.
Researcher Affiliation | Academia | Yining Wang, Hsiao-Yu Tung, Alex Smola — Machine Learning Department, Carnegie Mellon University, Pittsburgh, PA 15213 (emails redacted); Anima Anandkumar — Department of EECS, University of California, Irvine, Irvine, CA 92697 (email redacted)
Pseudocode | Yes | Algorithm 1: Fast robust tensor power method
Open Source Code | No | The paper does not explicitly state that the code for the described methodology is open-sourced, nor does it link to a code repository.
Open Datasets | Yes | We compare our proposed fast spectral LDA algorithm with baseline spectral methods and collapsed Gibbs sampling (using Gibbs LDA++ [25] implementation) on two real-world datasets: Wikipedia and Enron.
Dataset Splits | No | Obtained topic models Φ ∈ R^{V×K} are evaluated on a held-out dataset consisting of 1000 documents randomly picked out from the training datasets. (This describes a held-out test set, but not a separate validation set or specific train/validation/test splits for reproducibility.)
Hardware Specification | Yes | All methods are implemented in C++ and tested on a single machine with 8 Intel X5550 @ 2.67 GHz CPUs and 32 GB memory.
Software Dependencies | No | All methods are implemented in C++ and tested on a single machine with 8 Intel X5550 @ 2.67 GHz CPUs and 32 GB memory. We compare our proposed fast spectral LDA algorithm with baseline spectral methods and collapsed Gibbs sampling (using Gibbs LDA++ [25] implementation)... (No specific version numbers are given for the C++ toolchain or Gibbs LDA++.)
Experiment Setup | Yes | For the robust tensor power method the parameters are set to L = 50 and T = 30. For ALS we iterate until convergence, or until a maximum of 1000 iterations is reached. α0 is set to 1.0 and B is set to 30. For collapsed Gibbs sampling, we set α = 50/K and β = 0.1 following [11].
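The robust tensor power method referenced above (Algorithm 1, run with L = 50 random restarts and T = 30 iterations in the paper's experiments) can be sketched as follows. This is a minimal NumPy illustration of the basic, non-sketched power iteration only; it is not the authors' implementation (which is not released), and all function and variable names here are our own.

```python
import numpy as np

def robust_tensor_power(tensor, L=50, T=30, rng=None):
    """Estimate the top eigenpair of a symmetric 3-way tensor by power
    iteration with L random restarts of T iterations each.
    A simplified illustration; the paper's contribution is accelerating
    this inner loop with sketching, which is omitted here."""
    rng = np.random.default_rng(rng)
    d = tensor.shape[0]
    best_lam, best_u = -np.inf, None
    for _ in range(L):                     # L random restarts
        u = rng.standard_normal(d)
        u /= np.linalg.norm(u)
        for _ in range(T):                 # T power iterations
            # tensor-vector contraction T(I, u, u)
            u = np.einsum('ijk,j,k->i', tensor, u, u)
            u /= np.linalg.norm(u)
        # eigenvalue estimate T(u, u, u)
        lam = np.einsum('ijk,i,j,k->', tensor, u, u, u)
        if lam > best_lam:
            best_lam, best_u = lam, u
    return best_lam, best_u

# Sanity check on a rank-1 symmetric tensor 2 * v⊗v⊗v:
v = np.array([1.0, 0.0, 0.0])
lam, u = robust_tensor_power(np.einsum('i,j,k->ijk', 2.0 * v, v, v),
                             L=10, T=20, rng=0)
# Further factors would be extracted by deflating: tensor -= lam * u⊗u⊗u.
```

In the full method each recovered eigenpair is subtracted off (deflation) before the next restart sequence, which is why the restart count L matters for robustness to spurious local optima.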