GOCPT: Generalized Online Canonical Polyadic Tensor Factorization and Completion

Authors: Chaoqi Yang, Cheng Qian, Jimeng Sun

IJCAI 2022

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "Experimental results demonstrate that our GOCPT can improve fitness by up to 2.8% on the JHU Covid data and 9.2% on a proprietary patient claim dataset over baselines. Our variant GOCPTE shows up to 1.2% and 5.5% fitness improvement on two datasets with about 20% speedup compared to the best model." |
| Researcher Affiliation | Collaboration | Chaoqi Yang1, Cheng Qian2 and Jimeng Sun1 (1Department of Computer Science, University of Illinois Urbana-Champaign; 2Analytics Center of Excellence, IQVIA) |
| Pseudocode | Yes | "Algorithm 1: Factor Updates for LE at time t" |
| Open Source Code | Yes | "The first version of GOCPT package has been released on PyPI and open-sourced on GitHub (https://github.com/ycq091044/GOCPT). Supplementary of this paper can be found in the same GitHub repository." |
| Open Datasets | Yes | "We use (i) JHU Covid data [Dong et al., 2020] and (ii) proprietary Patient Claim data to conduct the evaluation... ORL Database of Faces (FACE-3D) and Google Covid Search Symptom data (GCSS)... Indian Pines hyperspectral image dataset and a proprietary Covid disease counts data: location by disease by date, which we call the health tensor (Covid HT)." |
| Dataset Splits | No | The paper states "The leading 50% slices are used as preparation data" and "add one slice at each time step", implying a train/test split for the online setting, but it does not define a separate validation split or give percentages for one. |
| Hardware Specification | No | The paper does not report hardware details such as GPU models, CPU types, or memory used to run the experiments. |
| Software Dependencies | No | The paper notes that the GOCPT package is released on PyPI and GitHub but does not list versions of Python or of libraries (e.g., PyTorch, NumPy, SciPy). |
| Experiment Setup | Yes | "All experiments are conducted with 5 random seeds. The mean and standard deviations are reported. The leading 50% slices are used as preparation data to obtain the initial factors with rank R = 5. We generate three CP factors from a uniform [0, 1] distribution and then construct a low-rank tensor (I1, I2, I3, R) = (50, 50, 500, 5). We use the leading 10% slices along the (third) temporal mode as preparation; then, we add one slice at each time step to simulate mode growth. We randomly mask out 98% of the entries." |
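The synthetic setup quoted above can be sketched in NumPy. This is a minimal illustration, not the authors' released code: the random seed, the einsum-based CP reconstruction, and the mask layout are assumptions; only the shapes, the rank, the 10% preparation split, and the 98% masking ratio come from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed is an arbitrary choice for reproducibility

# Dimensions and CP rank from the paper's synthetic setting:
# (I1, I2, I3, R) = (50, 50, 500, 5)
I1, I2, I3, R = 50, 50, 500, 5

# Generate three CP factors from a uniform [0, 1] distribution.
A = rng.uniform(0, 1, (I1, R))
B = rng.uniform(0, 1, (I2, R))
C = rng.uniform(0, 1, (I3, R))

# Construct the low-rank tensor: X[i, j, k] = sum_r A[i, r] * B[j, r] * C[k, r].
X = np.einsum('ir,jr,kr->ijk', A, B, C)

# Use the leading 10% of slices along the (third) temporal mode as
# preparation data; the remaining slices arrive one per time step.
n_prep = int(0.10 * I3)
X_prep, X_stream = X[:, :, :n_prep], X[:, :, n_prep:]

# For the completion setting, randomly mask out 98% of the entries
# (True marks the ~2% of entries that remain observed).
mask = rng.random(X.shape) > 0.98
X_observed = np.where(mask, X, 0.0)
```

With these shapes, `X_prep` holds the first 50 temporal slices and `X_stream` the remaining 450, matching the online mode-growth protocol of one new slice per step.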