Learning a Latent Simplex in Input Sparsity Time

Authors: Ainesh Bakshi, Chiranjib Bhattacharyya, Ravi Kannan, David Woodruff, Samson Zhou

ICLR 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In this section, we describe a series of experiments that demonstrate the advantage of our algorithm, performed in Python 3.6.9 on an Intel Core i7-8700K 3.70 GHz CPU with 12 cores and 64GB DDR4 memory, using an Nvidia Geforce GTX 1080 Ti 11GB GPU, on both synthetic and real-world data."
Researcher Affiliation | Collaboration | Ainesh Bakshi (Carnegie Mellon University, abakshi@cs.cmu.edu); Chiranjib Bhattacharyya (Indian Institute of Science, chiru@iisc.ac.in); Ravi Kannan (Microsoft Research India, kannan@microsoft.com); David P. Woodruff (Carnegie Mellon University, dwoodruf@cs.cmu.edu); Samson Zhou (Carnegie Mellon University, samsonzhou@gmail.com)
Pseudocode | Yes | Algorithm 1: Learning a Latent k-Simplex in Input Sparsity Time
Open Source Code | No | The paper does not provide any explicit statement or link indicating that source code for the described methodology is publicly available.
Open Datasets | Yes | "We also evaluate the algorithms on the email-Eu-core network dataset of interactions across email data between individuals from a large European research institution (Yin et al., 2017; Leskovec et al., 2007) and the com-Youtube dataset of friendships on the Youtube social network (Yang & Leskovec, 2015), both accessed through the Stanford Network Analysis Project (SNAP)."
Dataset Splits | No | The paper describes the synthetic data generation parameters (n, d, k, p) and real-world dataset sizes (n, d) and community counts (k), but it does not specify any training, validation, or test dataset splits.
Hardware Specification | Yes | "performed in Python 3.6.9 on an Intel Core i7-8700K 3.70 GHz CPU with 12 cores and 64GB DDR4 memory, using an Nvidia Geforce GTX 1080 Ti 11GB GPU"
Software Dependencies | No | The paper mentions "Python 3.6.9" and "the svds method from the sparse scipy linalg package optimized by LAPACK". The Python version is given, but specific versions of SciPy or LAPACK are not, which is insufficient for full reproducibility.
Experiment Setup | Yes | "Since our theoretical results are most interesting when k ≪ d ≪ n, we set n = 50000, d = 1000, k ∈ {20, 50, 100} and generate a random d × n matrix A that consists of independent entries that are each 1 with probability p ∈ {1/500, 1/2000, 1/5000} and 0 with probability 1 − p. ... Finally, we consider a full end-to-end implementation comparing the runtime and least squares loss of the top k subspace algorithm and our input sparsity approximation algorithm over various ranges of the parameter k and smoothening parameter δn on the com-Youtube dataset..."
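The synthetic setup quoted in the Experiment Setup row (a d × n matrix of independent Bernoulli(p) entries) is straightforward to reproduce. The sketch below is not from the paper; it is a minimal illustration assuming NumPy and SciPy, with n scaled down from the paper's 50000 so it runs quickly.

```python
import numpy as np
from scipy import sparse

# Parameters following the paper's synthetic setup, except n is scaled
# down here for speed (the paper uses n = 50000, d = 1000,
# k in {20, 50, 100}, and p in {1/500, 1/2000, 1/5000}).
n, d, p = 5000, 1000, 1 / 500

rng = np.random.default_rng(0)
# Each entry of the d x n matrix A is independently 1 with probability p
# and 0 with probability 1 - p; store it sparsely since p is small.
dense_mask = rng.random((d, n)) < p
A = sparse.csr_matrix(dense_mask.astype(np.float64))

print(A.shape)                        # (1000, 5000)
print(round(A.nnz / (d * n), 4))      # empirical density, close to p = 0.002
```

Storing A in CSR form matters here: with p ≤ 1/500 the matrix is over 99.8% zeros, which is exactly the regime where the paper's input-sparsity-time claims are interesting.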
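The Software Dependencies row notes that the paper's baseline uses "the svds method from the sparse scipy linalg package". The following sketch shows how such a top-k subspace baseline is typically computed with `scipy.sparse.linalg.svds`; the toy matrix and sizes are hypothetical stand-ins, not the paper's data.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.linalg import svds

# Hypothetical small sparse 0/1 matrix standing in for the data matrix A.
rng = np.random.default_rng(1)
A = csr_matrix((rng.random((200, 500)) < 0.01).astype(np.float64))

# svds returns the top-k singular triplets of a sparse matrix, i.e. the
# rank-k subspace used by the "top k subspace algorithm" baseline.
k = 10
U, s, Vt = svds(A, k=k)
# svds returns singular values in ascending order; sort them descending.
order = np.argsort(s)[::-1]
U, s, Vt = U[:, order], s[order], Vt[order]

# Least-squares loss of the best rank-k approximation to A.
loss = np.linalg.norm(A.toarray() - (U * s) @ Vt) ** 2
print(U.shape, s.shape)  # (200, 10) (10,)
```

Note that `svds` requires k strictly smaller than min(d, n) and, unlike a full SVD, never materializes a dense factorization, which is what keeps the baseline feasible on large sparse inputs.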