Faster Algorithms for Binary Matrix Factorization

Authors: Ravi Kumar, Rina Panigrahy, Ali Rahimi, David Woodruff

ICML 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental In this section we present simple experimental results for our algorithm for k-BMF in the Frobenius norm error. The purpose of these experiments is to demonstrate the practical applicability of our algorithm to real-world data, especially for images, which primarily motivated our work. The datasets we use are the standard MNIST database of handwritten digits (yann.lecun.com/exdb/mnist/) and the ORL databases of black-and-white faces (www.cl.cam.ac.uk/research/dtg/ attarchive/facedatabase.html).
Researcher Affiliation Collaboration Ravi Kumar 1 Rina Panigrahy 1 Ali Rahimi 1 David P. Woodruff 2 1Google, 1600 Amphitheater Parkway, Mountain View, CA, US. 2CMU, 5000 Forbes Ave, Pittsburgh, PA, US.
Pseudocode No The paper provides a high-level 'schema' outlining the algorithm steps but does not present formal pseudocode or an explicitly labeled algorithm block.
Open Source Code No The paper mentions using third-party libraries (pymf, scikit-learn) but does not provide specific access to the source code for the methodology described in this paper.
Open Datasets Yes The datasets we use are the standard MNIST database of handwritten digits (yann.lecun.com/exdb/mnist/) and the ORL databases of black-and-white faces (www.cl.cam.ac.uk/research/dtg/ attarchive/facedatabase.html).
Dataset Splits No The paper mentions '60,000 images in the training set' for MNIST but does not provide specific train/validation/test split percentages, absolute counts for all splits, or a detailed splitting methodology for any of the datasets used.
Hardware Specification No The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used for running its experiments.
Software Dependencies No The paper mentions 'Python package pymf' and 'scikit-learn' but does not provide specific version numbers for these software dependencies, which are needed for reproducibility.
Experiment Setup Yes We run this baseline for 10,000 iterations and round the entries of the factors to be in {0, 1}. ...Our goal is to run various algorithms for k-BMF on each matrix and measure the Frobenius norm and the time taken to compute the factorization, for various values of k.