DFacTo: Distributed Factorization of Tensors

Authors: Joon Hee Choi, S. Vishwanathan

NeurIPS 2014

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Section 5 (Experimental Evaluation): Our experiments are designed to study the scaling behavior of DFacTo on both publicly available real-world datasets as well as synthetically generated data. We contrast the performance of DFacTo (ALS) with GigaTensor [8] as well as with CPALS [7], while the performance of DFacTo (GD) is compared with CPOPT [9]. We also present results to show the scaling behavior of DFacTo when data is distributed across multiple machines. Datasets: See Table 1 for a summary of the real-world datasets we used in our experiments.
Researcher Affiliation | Academia | Joon Hee Choi, Electrical and Computer Engineering, Purdue University, West Lafayette, IN 47907, choi240@purdue.edu; S. V. N. Vishwanathan, Statistics and Computer Science, Purdue University, West Lafayette, IN 47907, vishy@stat.purdue.edu
Pseudocode | Yes | Algorithm 1: DFacTo algorithm for Tensor Factorization (for orientation, a generic CP-ALS sketch follows this table)
Open Source Code | Yes | All our codes are available for download under an open source license from http://www.joonheechoi.com/research.
Open Datasets | Yes | Our experiments are designed to study the scaling behavior of DFacTo on both publicly available real-world datasets as well as synthetically generated data. Datasets: See Table 1 for a summary of the real-world datasets we used in our experiments. The NELL-1 and NELL-2 datasets are from [8]... The Yelp Phoenix dataset is from the Yelp Data Challenge, while Cellartracker, Ratebeer, Beeradvocate and Amazon.com are from the Stanford Network Analysis Project (SNAP) home page.
Dataset Splits | No | The paper focuses on the speed and scalability of tensor factorization algorithms rather than predictive model performance. It does not provide specific train/validation/test dataset splits, percentages, or sample counts needed to reproduce data partitioning for the datasets used.
Hardware Specification | Yes | All experiments were conducted on a computing cluster where each node has two 2.1 GHz 12-core AMD 6172 processors with 48 GB physical memory per node.
Software Dependencies | Yes | Our algorithms are implemented in C++ using the Eigen library and compiled with the Intel Compiler. We downloaded Version 2.5 of the Tensor Toolbox, which is implemented in MATLAB. Also, we used MPICH2 in order to distribute the tensor factorization computation to multiple machines. (A generic MPI distribution sketch follows this table.)
Experiment Setup | No | The paper mentions the rank `R` used in experiments (e.g., "R=10" in Table 2), which is a model parameter. However, it does not provide other typical experimental setup details such as specific learning rates for the gradient descent variant, convergence criteria, or other system-level training settings commonly found in "experimental setup" sections.
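
For orientation only: the paper's Algorithm 1 (DFacTo) is not reproduced in this summary, but since it accelerates the core computation inside standard CP decomposition solvers, a minimal serial CP-ALS sketch in NumPy may help a reader see what is being distributed. The unfolding and Khatri-Rao conventions, function names, and the small dense third-order tensor below are illustrative assumptions, not the paper's distributed algorithm or its C++ implementation.

```python
import numpy as np

def unfold(T, mode):
    # Mode-n unfolding (Kolda/Bader convention): rows index the chosen mode.
    return np.reshape(np.moveaxis(T, mode, 0), (T.shape[mode], -1), order="F")

def khatri_rao(U, V):
    # Column-wise Kronecker product: result is (rows(U) * rows(V)) x R.
    R = U.shape[1]
    return (U[:, None, :] * V[None, :, :]).reshape(-1, R)

def cp_als(T, R, iters=50, seed=0):
    # Rank-R CP decomposition of a third-order tensor via alternating least squares.
    rng = np.random.default_rng(seed)
    I, J, K = T.shape
    A = rng.standard_normal((I, R))
    B = rng.standard_normal((J, R))
    C = rng.standard_normal((K, R))
    X0, X1, X2 = unfold(T, 0), unfold(T, 1), unfold(T, 2)
    for _ in range(iters):
        A = X0 @ khatri_rao(C, B) @ np.linalg.pinv((C.T @ C) * (B.T @ B))
        B = X1 @ khatri_rao(C, A) @ np.linalg.pinv((C.T @ C) * (A.T @ A))
        C = X2 @ khatri_rao(B, A) @ np.linalg.pinv((B.T @ B) * (A.T @ A))
    return A, B, C

if __name__ == "__main__":
    # Sanity check: recover a synthetic rank-5 tensor.
    rng = np.random.default_rng(1)
    A0, B0, C0 = (rng.standard_normal((n, 5)) for n in (30, 40, 50))
    T = np.einsum("ir,jr,kr->ijk", A0, B0, C0)
    A, B, C = cp_als(T, R=5)
    T_hat = np.einsum("ir,jr,kr->ijk", A, B, C)
    print("relative error:", np.linalg.norm(T - T_hat) / np.linalg.norm(T))
```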
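The Software Dependencies row states only that MPICH2 was used to distribute the factorization across machines; the released code is C++. The sketch below is therefore a generic illustration, not DFacTo's actual communication scheme: the row partitioning, the use of mpi4py, the random placeholder data, and all variable names are assumptions, showing one common way a row-distributed factor update can be gathered across ranks.

```python
# Run with e.g.: mpiexec -n 4 python distributed_update_sketch.py
# Generic pattern (not the paper's scheme): each rank owns a contiguous block of
# rows of factor A, updates them locally, then all ranks gather the full matrix.
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

I, J, K, R = 64, 40, 50, 5
rows = np.array_split(np.arange(I), size)[rank]   # rows of A owned by this rank

rng = np.random.default_rng(0)                    # same seed so B, C agree on all ranks
B = rng.standard_normal((J, R))
C = rng.standard_normal((K, R))
# Local slice of the mode-1 unfolded tensor (random placeholder here; in practice
# each machine would load only its own rows of the unfolded tensor from disk).
X0_local = np.random.default_rng(rank).standard_normal((len(rows), J * K))

def khatri_rao(U, V):
    # Column-wise Kronecker product of the two factor matrices.
    return (U[:, None, :] * V[None, :, :]).reshape(-1, U.shape[1])

# Each rank solves only for its rows of A; the Gram matrices are small (R x R).
A_local = X0_local @ khatri_rao(C, B) @ np.linalg.pinv((C.T @ C) * (B.T @ B))

# Exchange the row blocks so every rank holds the updated A before the next mode.
A = np.vstack(comm.allgather(A_local))
if rank == 0:
    print("updated A has shape", A.shape)
```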