DFacTo: Distributed Factorization of Tensors
Authors: Joon Hee Choi, S. Vishwanathan
NeurIPS 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 5 Experimental Evaluation: Our experiments are designed to study the scaling behavior of DFacTo on both publicly available real-world datasets as well as synthetically generated data. We contrast the performance of DFacTo (ALS) with GigaTensor [8] as well as with CPALS [7], while the performance of DFacTo (GD) is compared with CPOPT [9]. We also present results to show the scaling behavior of DFacTo when data is distributed across multiple machines. Datasets: See Table 1 for a summary of the real-world datasets we used in our experiments. |
| Researcher Affiliation | Academia | Joon Hee Choi, Electrical and Computer Engineering, Purdue University, West Lafayette, IN 47907, choi240@purdue.edu; S. V. N. Vishwanathan, Statistics and Computer Science, Purdue University, West Lafayette, IN 47907, vishy@stat.purdue.edu |
| Pseudocode | Yes | Algorithm 1: DFacTo algorithm for Tensor Factorization |
| Open Source Code | Yes | All our codes are available for download under an open source license from http://www.joonheechoi.com/research. |
| Open Datasets | Yes | Our experiments are designed to study the scaling behavior of DFacTo on both publicly available real-world datasets as well as synthetically generated data. Datasets: See Table 1 for a summary of the real-world datasets we used in our experiments. The NELL-1 and NELL-2 datasets are from [8]... The Yelp Phoenix dataset is from the Yelp Data Challenge, while Cellartracker, Ratebeer, Beeradvocate and Amazon.com are from the Stanford Network Analysis Project (SNAP) home page. |
| Dataset Splits | No | The paper focuses on the speed and scalability of tensor factorization algorithms rather than predictive model performance. It does not provide specific train/validation/test dataset splits, percentages, or sample counts needed to reproduce data partitioning for the datasets used. |
| Hardware Specification | Yes | All experiments were conducted on a computing cluster where each node has two 2.1 GHz 12-core AMD 6172 processors with 48 GB physical memory per node. |
| Software Dependencies | Yes | Our algorithms are implemented in C++ using the Eigen library and compiled with the Intel Compiler. We downloaded Version 2.5 of the Tensor Toolbox, which is implemented in MATLAB. Also, we used MPICH2 in order to distribute the tensor factorization computation to multiple machines. |
| Experiment Setup | No | The paper mentions the rank `R` used in experiments (e.g., "R=10" in Table 2), which is a model parameter. However, it does not provide other typical experimental setup details such as specific learning rates for the Gradient Descent algorithm, convergence criteria, or other system-level training settings commonly found in "experimental setup" sections. |
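The Pseudocode and Experiment Setup rows above point to Algorithm 1 (the DFacTo factorization) and a CP rank of R=10, but the report does not reproduce the algorithm itself. For reference, below is a minimal CP-ALS sketch in NumPy for a dense 3-way tensor. It only illustrates the alternating-least-squares updates that the CPALS baseline and DFacTo (ALS) both compute; it is not the authors' distributed C++/Eigen implementation, and the function names (`unfold`, `khatri_rao`, `cp_als_dense`) and the random test tensor are hypothetical.

```python
# Minimal, illustrative CP-ALS for a dense 3-way tensor (NOT the DFacTo code).
import numpy as np

def unfold(T, mode):
    """Mode-n matricization (C-order): rows indexed by `mode`."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def khatri_rao(A, B):
    """Column-wise Khatri-Rao product of A (I x R) and B (J x R) -> (I*J x R)."""
    R = A.shape[1]
    return np.einsum('ir,jr->ijr', A, B).reshape(-1, R)

def cp_als_dense(T, R=10, iters=50, seed=0):
    """Rank-R CP decomposition of a dense 3-way tensor via ALS."""
    rng = np.random.default_rng(seed)
    A = [rng.standard_normal((dim, R)) for dim in T.shape]
    for _ in range(iters):
        for n in range(3):
            others = [m for m in range(3) if m != n]   # the two fixed modes
            KR = khatri_rao(A[others[0]], A[others[1]])
            # Hadamard product of the Gram matrices of the fixed factors.
            G = (A[others[0]].T @ A[others[0]]) * (A[others[1]].T @ A[others[1]])
            A[n] = unfold(T, n) @ KR @ np.linalg.pinv(G)
    return A

if __name__ == "__main__":
    T = np.random.rand(30, 40, 50)                      # hypothetical test tensor
    A, B, C = cp_als_dense(T, R=10, iters=20)
    approx = np.einsum('ir,jr,kr->ijk', A, B, C)
    print("relative error:", np.linalg.norm(T - approx) / np.linalg.norm(T))
```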
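The Software Dependencies row notes that MPICH2 is used to distribute the factorization across machines, but the communication pattern is not described in this report. The sketch below, using mpi4py, shows one generic way to split a single ALS factor update across MPI ranks and re-assemble the result; it is an assumption-laden stand-in, not DFacTo's actual distribution scheme, and every variable name and array in it is hypothetical.

```python
# Illustrative only: distributing one ALS factor update over MPI ranks with
# mpi4py (the paper uses MPICH2 from C++; this is not DFacTo's scheme).
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

I, R = 1000, 10                                    # factor rows, CP rank
rows = np.array_split(np.arange(I), size)[rank]    # this rank's row block

# Each rank holds the rows of the mode-n unfolding it is responsible for
# (random data stands in for a real tensor slice here).
X_n_local = np.random.rand(len(rows), 200)
KR = np.random.rand(200, R)   # stand-in for the Khatri-Rao product of the other factors
G = KR.T @ KR                 # its Gram matrix (identical on every rank)

# Local least-squares solve for this rank's block of rows.
A_n_local = X_n_local @ KR @ np.linalg.pinv(G)

# Gather the full updated factor matrix on every rank for the next mode.
A_n = np.vstack(comm.allgather(A_n_local))
if rank == 0:
    print("updated factor shape:", A_n.shape)
```

Such a script would be launched with something like `mpiexec -n 4 python als_mpi_sketch.py` (file name hypothetical).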