Exact Representation of Sparse Networks with Symmetric Nonnegative Embeddings

Authors: Sudhanshu Chanpuriya, Ryan Rossi, Anup B. Rao, Tung Mai, Nedim Lipka, Zhao Song, Cameron Musco

NeurIPS 2023

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "In experiments on real-world networks, we demonstrate our factorization's effectiveness on a variety of tasks, including community detection and link prediction." ... 6 Experiments ... 6.3 Results |
| Researcher Affiliation | Collaboration | Sudhanshu Chanpuriya (1), Ryan A. Rossi (2), Anup Rao (2), Tung Mai (2), Nedim Lipka (2), Zhao Song (2), and Cameron Musco (3); (1) University of Illinois Urbana-Champaign, schariya@illinois.edu; (2) Adobe Research, {ryrossi,anuprao,tumai,lipka,zsong}@adobe.com; (3) University of Massachusetts Amherst, cmusco@cs.umass.edu |
| Pseudocode | Yes | Algorithm 1: Converting LPCA Factors to Community Factors ... Algorithm 2: Fitting the Constrained Model |
| Open Source Code | Yes | "We include code in the form of a Jupyter notebook (Pérez & Granger, 2007) demo." |
| Open Datasets | Yes | "We use five fairly common mid-size datasets ranging from around 1K to 10K nodes." ... BLOG (Tang & Liu, 2009), YOUTUBE (Yang & Leskovec, 2015), POS (Qiu et al., 2018), PPI (Breitkreutz et al., 2007), AMAZON (Yang & Leskovec, 2015) |
| Dataset Splits | No | The paper describes a 90% training / 10% test edge split for link prediction, but it does not specify a validation split or give explicit split percentages for every experiment, which full reproducibility would require (a minimal edge-split sketch follows the table). |
| Hardware Specification | No | The paper does not report the hardware used for its experiments (no GPU/CPU models or other machine specifications). |
| Software Dependencies | No | The paper mentions using PyTorch and SciPy but does not give version numbers for these dependencies. |
| Experiment Setup | Yes | "We set regularization weight λ = 10 as in Yang & Leskovec (2013)." ... "up to a max of 200 iterations of optimization." ... "We generally use a batch size of 100; we find that the optimization of BIGCLAM often diverges on COMPANY B with this batch size, so we instead use batches of size 1000 for its optimization." (A training-loop sketch wired with these settings follows the table.) |
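
For concreteness, the following is a minimal sketch of the 90%/10% edge hold-out that the Dataset Splits row refers to. The function name `split_edges`, the use of SciPy sparse matrices, and the seeding are illustrative assumptions, not the paper's released code.

```python
import numpy as np
import scipy.sparse as sp

def split_edges(adj, test_frac=0.10, seed=0):
    """Randomly hold out a fraction of an undirected graph's edges for testing."""
    rng = np.random.default_rng(seed)
    rows, cols = sp.triu(adj, k=1).nonzero()   # each undirected edge counted once
    perm = rng.permutation(len(rows))
    n_test = int(round(test_frac * len(rows)))
    test_idx, train_idx = perm[:n_test], perm[n_test:]

    def symmetrize(idx):
        # Rebuild a symmetric adjacency matrix from the selected edges.
        r, c = rows[idx], cols[idx]
        data = np.ones(2 * len(idx))
        mat = sp.coo_matrix(
            (data, (np.concatenate([r, c]), np.concatenate([c, r]))),
            shape=adj.shape,
        )
        return mat.tocsr()

    return symmetrize(train_idx), symmetrize(test_idx)
```

Given a scipy.sparse adjacency matrix A, `split_edges(A)` returns a training graph with roughly 90% of the edges and a test graph with the remaining 10%; sampling the negative (non-edge) pairs that link-prediction evaluation also needs is out of scope here.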
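
The Experiment Setup quotes fix only a few optimization settings (λ = 10, at most 200 iterations, batches of 100 nodes, or 1000 where optimization diverges). A generic PyTorch loop wired with those settings might look like the sketch below; the factor parameterization, the logistic reconstruction loss, the placeholder regularizer, and the choice of Adam are assumptions for illustration and do not reproduce the paper's Algorithm 2.

```python
import torch
import torch.nn.functional as F

def fit_nonnegative_factor(adj, rank, reg_weight=10.0, max_iters=200, batch_size=100):
    """Toy symmetric nonnegative factorization trained with the quoted settings.

    `adj` is assumed to be a dense (n, n) float tensor with 0/1 entries.
    """
    n = adj.shape[0]
    V = torch.rand(n, rank, requires_grad=True)    # nonnegative node embeddings
    opt = torch.optim.Adam([V])                    # optimizer choice is an assumption
    for _ in range(max_iters):
        batch = torch.randperm(n)[:batch_size]     # minibatch of node rows
        logits = V[batch] @ V.t()                  # scores against all nodes
        loss = F.binary_cross_entropy_with_logits(logits, adj[batch])
        loss = loss + reg_weight * V.abs().mean()  # placeholder regularizer, weight = 10
        opt.zero_grad()
        loss.backward()
        opt.step()
        with torch.no_grad():
            V.clamp_(min=0.0)                      # projected step keeps embeddings >= 0
    return V.detach()
```

Per the quoted setup, `batch_size` would be raised to 1000 for the runs where optimization with batches of 100 diverges.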