SWIFT: Scalable Wasserstein Factorization for Sparse Nonnegative Tensors

Authors: Ardavan Afshar, Kejing Yin, Sherry Yan, Cheng Qian, Joyce Ho, Haesun Park, Jimeng Sun

AAAI 2021, pp. 6548-6556 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental Results; Datasets and Evaluation Metrics; Classification Performance of SWIFT; We compare the performance of SWIFT against tensor factorization methods with different loss functions and their variants. The first loss function minimizes the sum-of-squares loss and has 4 variants: 1) CP-ALS (Bader and Kolda 2007); 2) CP-NMU (Bader and Kolda 2007); 3) Supervised CP (Kim et al. 2017); and 4) Similarity-based CP (Kim et al. 2017). Table 1 summarizes the classification performance using the factor matrices obtained by SWIFT and the baselines with varying target rank (R ∈ {5, 10, 20, 30, 40}). We report the mean and standard deviation of accuracy for BBC News, and of the PR-AUC score for the Sutter dataset, over the test set. For BBC News, SWIFT outperforms all baselines across target ranks, with relative improvements ranging from 1.69% to 9.65%. (An evaluation sketch follows the table.)
Researcher Affiliation | Collaboration | Ardavan Afshar (1), Kejing Yin (2), Sherry Yan (3), Cheng Qian (4), Joyce Ho (5), Haesun Park (1), Jimeng Sun (6). (1) Computational Science and Engineering, Georgia Institute of Technology, Atlanta, USA; (2) Department of Computer Science, Hong Kong Baptist University, Hong Kong, China; (3) Research Development & Dissemination, Sutter Health, Walnut Creek, USA; (4) Analytic Center of Excellence, IQVIA, Cambridge, USA; (5) Department of Computer Science, Emory University, Atlanta, USA; (6) Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, USA
Pseudocode | Yes | Algorithm 1: SWIFT. Input: $\mathcal{X} \in \mathbb{R}_{+}^{I_1 \times I_2 \times \cdots \times I_N}$, $C_n$ and $\mathrm{NNZ}_n$ for $n = 1, \ldots, N$, target rank $R$, $\lambda$, and $\rho$. Output: $A_n \in \mathbb{R}_{+}^{I_n \times R}$ for $n = 1, \ldots, N$. (A skeletal interface sketch follows the table.)
Open Source Code | No | The paper does not provide any explicit statements or links indicating that the source code for the described methodology is publicly available.
Open Datasets | Yes | BBC News (Greene and Cunningham 2006) is a publicly available dataset from the BBC News Agency for the text classification task.
Dataset Splits | Yes | We performed 5-fold cross validation and split the data into training, validation, and test sets by a ratio of 3:1:1. (A split sketch follows the table.)
Hardware Specification | No | The paper mentions 'The running time of one iteration in seconds' and that 'none of the baselines can be run with parallelization', but does not provide specific details about the hardware used for running experiments, such as GPU models, CPU types, or memory specifications.
Software Dependencies | No | The paper mentions 'Efficient MATLAB computations with sparse and factored tensors' in the context of CP-ALS (a baseline reference), but does not provide specific version numbers for software dependencies or libraries used in its own implementation of SWIFT or experiments.
Experiment Setup | Yes | We performed 5-fold cross validation and split the data into training, validation, and test sets by a ratio of 3:1:1. Table 1 summarizes the classification performance using the factor matrices obtained by SWIFT and the baselines with varying target rank (R ∈ {5, 10, 20, 30, 40}). We set R = 40. Algorithm 1 also takes the input parameters λ and ρ.
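
Algorithm 1's inputs and outputs map directly onto a function signature. Below is a minimal skeleton of that interface, assuming a NumPy-style dense tensor for readability; the per-mode update rules are not reproduced from the paper, and every name here (swift, n_iters) is illustrative rather than the authors' implementation.

```python
# Skeletal interface mirroring Algorithm 1's inputs and outputs.
# The update rules are deliberately omitted -- this is not the authors' code.
import numpy as np

def swift(X, C, NNZ, R, lam, rho, n_iters=100):
    """Factorize a nonnegative tensor X of shape (I_1, ..., I_N).

    C        : list of N ground-cost matrices C_n for the Wasserstein loss
    NNZ      : list of N nonzero-index sets, one per mode
    R        : target rank
    lam, rho : the lambda and rho parameters from Algorithm 1
    Returns the N nonnegative factor matrices A_n, each of shape (I_n, R).
    """
    rng = np.random.default_rng(0)
    A = [rng.random((I_n, R)) for I_n in X.shape]  # nonnegative initialization
    for _ in range(n_iters):
        for n in range(len(A)):
            # Per-mode update of A[n] under the Wasserstein objective,
            # as specified in Algorithm 1 of the paper (omitted here).
            pass
    return A
```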
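
The 3:1:1 protocol inside 5-fold cross validation can be read as: partition the samples into five equal parts, then in each fold use three parts for training, one for validation, and one for testing. A minimal sketch under that reading follows; the exact rotation scheme is an assumption, since the paper only states the ratio.

```python
# Sketch of a 5-fold split with a 3:1:1 train/val/test ratio per fold.
# The rotation scheme below is one plausible reading of the paper's setup.
import numpy as np

def five_fold_splits(n_samples, seed=0):
    """Yield (train_idx, val_idx, test_idx) for each of the five folds."""
    rng = np.random.default_rng(seed)
    parts = np.array_split(rng.permutation(n_samples), 5)
    for k in range(5):
        test_idx = parts[k]
        val_idx = parts[(k + 1) % 5]
        train_idx = np.concatenate(
            [parts[i] for i in range(5) if i not in (k, (k + 1) % 5)])
        yield train_idx, val_idx, test_idx
```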
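
The rank sweep behind Table 1 can then be reproduced with a loop of the following shape. The quoted text does not name the downstream classifier, so logistic regression is a stand-in; factorize is a placeholder for SWIFT or any baseline, and the validation indices would presumably drive hyperparameter selection (e.g., λ and ρ). For the Sutter dataset, sklearn's average_precision_score would replace accuracy as the metric.

```python
# Hedged sketch of the rank sweep: for each target rank R, factorize the
# tensor, train a simple classifier on the sample-mode factor matrix, and
# report mean/std of the test metric over the five folds.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

def evaluate_rank(factorize, X, y, splits, R):
    A1 = factorize(X, rank=R)  # sample-mode factor matrix, shape (n_samples, R)
    scores = []
    for train_idx, val_idx, test_idx in splits:
        # val_idx would be used for hyperparameter selection; omitted here.
        clf = LogisticRegression(max_iter=1000).fit(A1[train_idx], y[train_idx])
        scores.append(accuracy_score(y[test_idx], clf.predict(A1[test_idx])))
    return float(np.mean(scores)), float(np.std(scores))

# Illustrative usage, reusing the sketches above (all names hypothetical):
# factorize = lambda T, rank: swift(T, C, NNZ, rank, lam, rho)[0]  # mode-1 factor
# splits = list(five_fold_splits(len(y)))
# for R in (5, 10, 20, 30, 40):
#     print(R, evaluate_rank(factorize, X, y, splits, R))
```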