Efficient Nonparametric Tensor Decomposition for Binary and Count Data

Authors: Zerui Tao, Toshihisa Tanaka, Qibin Zhao

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate our model on several real-world tensor completion tasks, considering binary and count datasets. The results manifest both better performance and computational advantages of the proposed model.
Researcher Affiliation | Academia | Tokyo University of Agriculture and Technology, Japan; RIKEN Center for Advanced Intelligence Project (AIP), Japan
Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks; it describes the model and inference steps in prose.
Open Source Code | Yes | The code is mainly based on PyTorch (Paszke et al. 2019) and is available at https://github.com/taozerui/gptd
Open Datasets | Yes | We test our model on three binary tensor datasets: (1) Digg (Xu, Yan, and Qi 2012)...; (2) Enron (Xu, Yan, and Qi 2012)...; (3) DBLP (Zhe et al. 2016)... We evaluate the proposed model on three count tensors: (1) JHU (Dong, Du, and Gardner 2020)...; (2) Article (Zhe and Du 2018)...; (3) EMS (Zhe and Du 2018)...
Dataset Splits | Yes | For Digg and Enron, we randomly sample an equal number of zero entries to obtain a balanced dataset. For DBLP, the same train/test split as in Zhe et al. (2016) is adopted. For binary datasets, we evaluate the area under the ROC curve (AUC) and the negative log-likelihood (NLL) of the estimated Bernoulli distributions. We report the mean and standard deviation of 5-fold cross-validation. The data is fully observed and we use 20% of the observations to predict the remaining entries. (A sketch of this balanced-sampling and evaluation protocol is given after the table.)
Hardware Specification | Yes | All experiments are conducted on a workstation with an Intel Xeon Silver 4316 CPU @ 2.30 GHz, 512 GB RAM, and NVIDIA RTX A6000 GPUs.
Software Dependencies | No | The paper mentions "PyTorch (Paszke et al. 2019)" but does not specify a version number for PyTorch or other software dependencies.
Experiment Setup | Yes | All stochastic methods are optimized using a batch size of 128. Gradient-based models are optimized using Adam with a learning rate chosen from {3e-3, 1e-3, 3e-4, 1e-4}, except GCP, whose default optimizer is L-BFGS. We test all methods with tensor ranks chosen from {3, 5, 10}. For GP-based methods, we use 100 inducing points and an RBF kernel with bandwidth 1.0, consistent with previous work (Zhe et al. 2016; Zhe and Du 2018). Note that, for ENTED, the number of inducing points is 50 + 50 for u and v, respectively. (A sketch of this training configuration is given after the table.)
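
The balanced sampling of zero entries and the AUC / Bernoulli-NLL evaluation quoted under "Dataset Splits" can be illustrated with a short sketch. The helper names and toy data below are illustrative assumptions, not taken from the authors' repository at https://github.com/taozerui/gptd.

```python
# Minimal sketch (assumed, not the authors' code): balanced zero-entry
# sampling and AUC / Bernoulli-NLL evaluation for a binary tensor.
import numpy as np
from sklearn.metrics import roc_auc_score

def sample_balanced_zeros(pos_idx, shape, rng):
    """Draw as many zero (negative) entries as there are observed ones."""
    pos_set = {tuple(i) for i in pos_idx}
    neg = set()
    while len(neg) < len(pos_idx):
        cand = tuple(int(rng.integers(0, d)) for d in shape)
        if cand not in pos_set:
            neg.add(cand)
    return np.array(sorted(neg))

def auc_and_nll(probs, labels, eps=1e-6):
    """AUC and mean negative log-likelihood of estimated Bernoulli distributions."""
    p = np.clip(probs, eps, 1.0 - eps)
    nll = -(labels * np.log(p) + (1.0 - labels) * np.log(1.0 - p)).mean()
    return roc_auc_score(labels, p), nll

# Toy usage with random stand-ins for the model's predicted probabilities.
rng = np.random.default_rng(0)
pos_idx = rng.integers(0, 10, size=(50, 3))          # observed (nonzero) entries
neg_idx = sample_balanced_zeros(pos_idx, (10, 10, 10), rng)
labels = np.concatenate([np.ones(len(pos_idx)), np.zeros(len(neg_idx))])
probs = rng.uniform(size=labels.shape)
auc, nll = auc_and_nll(probs, labels)
print(f"AUC={auc:.3f}, NLL={nll:.3f}")
```

In practice the predicted probabilities would come from the fitted model rather than a random draw, and the procedure would be repeated over the 5 cross-validation folds to report mean and standard deviation.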
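
The optimizer and hyper-parameter grid quoted under "Experiment Setup" can likewise be sketched in PyTorch. Here `build_model`, `model.elbo`, and `train_loader` are hypothetical placeholders standing in for the authors' model and data pipeline; this is an assumed outline, not their training loop.

```python
# Assumed outline of the hyper-parameter grid and Adam setup described above;
# `build_model`, `model.elbo`, and `train_loader` are hypothetical placeholders.
import itertools
import torch

LEARNING_RATES = [3e-3, 1e-3, 3e-4, 1e-4]   # grid reported in the paper
TENSOR_RANKS = [3, 5, 10]
BATCH_SIZE = 128
NUM_INDUCING = 100                           # ENTED uses 50 + 50 for u and v

for rank, lr in itertools.product(TENSOR_RANKS, LEARNING_RATES):
    model = build_model(rank=rank, num_inducing=NUM_INDUCING)  # hypothetical factory
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for idx_batch, y_batch in train_loader:      # mini-batches of size 128
        optimizer.zero_grad()
        loss = -model.elbo(idx_batch, y_batch)   # hypothetical variational objective
        loss.backward()
        optimizer.step()
```

GCP is the exception in the quoted setup: it keeps its default L-BFGS optimizer instead of Adam, so it would not enter the learning-rate grid above.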