reproducibilityindex.ai

Faster Algorithms for Binary Matrix Factorization

Authors: Ravi Kumar, Rina Panigrahy, Ali Rahimi, David Woodruff

ICML 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In this section we present simple experimental results for our algorithm for k-BMF in the Frobenius norm error. The purpose of these experiments is to demonstrate the practical applicability of our algorithm to real-world data, especially for images, which primarily motivated our work. The datasets we use are the standard MNIST database of handwritten digits (yann.lecun.com/exdb/mnist/) and the ORL databases of black-and-white faces (www.cl.cam.ac.uk/research/dtg/ attarchive/facedatabase.html).
Researcher Affiliation	Collaboration	Ravi Kumar 1 Rina Panigrahy 1 Ali Rahimi 1 David P. Woodruff 2 1Google, 1600 Amphitheater Parkway, Mountain View, CA, US. 2CMU, 5000 Forbes Ave, Pittsburgh, PA, US.
Pseudocode	No	The paper provides a high-level 'schema' outlining the algorithm steps but does not present formal pseudocode or an explicitly labeled algorithm block.
Open Source Code	No	The paper mentions using third-party libraries (pymf, scikit-learn) but does not provide specific access to the source code for the methodology described in this paper.
Open Datasets	Yes	The datasets we use are the standard MNIST database of handwritten digits (yann.lecun.com/exdb/mnist/) and the ORL databases of black-and-white faces (www.cl.cam.ac.uk/research/dtg/ attarchive/facedatabase.html).
Dataset Splits	No	The paper mentions '60,000 images in the training set' for MNIST but does not provide specific train/validation/test split percentages, absolute counts for all splits, or a detailed splitting methodology for any of the datasets used.
Hardware Specification	No	The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used for running its experiments.
Software Dependencies	No	The paper mentions 'Python package pymf' and 'scikit-learn' but does not provide specific version numbers for these software dependencies, which are needed for reproducibility.
Experiment Setup	Yes	We run this baseline for 10,000 iterations and round the entries of the factors to be in {0, 1}. ...Our goal is to run various algorithms for k-BMF on each matrix and measure the Frobenius norm and the time taken to compute the factorization, for various values of k.