Faster Algorithms for Binary Matrix Factorization
Authors: Ravi Kumar, Rina Panigrahy, Ali Rahimi, David Woodruff
ICML 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section we present simple experimental results for our algorithm for k-BMF in the Frobenius norm error. The purpose of these experiments is to demonstrate the practical applicability of our algorithm to real-world data, especially for images, which primarily motivated our work. The datasets we use are the standard MNIST database of handwritten digits (yann.lecun.com/exdb/mnist/) and the ORL databases of black-and-white faces (www.cl.cam.ac.uk/research/dtg/ attarchive/facedatabase.html). |
| Researcher Affiliation | Collaboration | Ravi Kumar 1 Rina Panigrahy 1 Ali Rahimi 1 David P. Woodruff 2 1Google, 1600 Amphitheater Parkway, Mountain View, CA, US. 2CMU, 5000 Forbes Ave, Pittsburgh, PA, US. |
| Pseudocode | No | The paper provides a high-level 'schema' outlining the algorithm steps but does not present formal pseudocode or an explicitly labeled algorithm block. |
| Open Source Code | No | The paper mentions using third-party libraries (pymf, scikit-learn) but does not provide specific access to the source code for the methodology described in this paper. |
| Open Datasets | Yes | The datasets we use are the standard MNIST database of handwritten digits (yann.lecun.com/exdb/mnist/) and the ORL databases of black-and-white faces (www.cl.cam.ac.uk/research/dtg/ attarchive/facedatabase.html). |
| Dataset Splits | No | The paper mentions '60,000 images in the training set' for MNIST but does not provide specific train/validation/test split percentages, absolute counts for all splits, or a detailed splitting methodology for any of the datasets used. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper mentions 'Python package pymf' and 'scikit-learn' but does not provide specific version numbers for these software dependencies, which are needed for reproducibility. |
| Experiment Setup | Yes | We run this baseline for 10,000 iterations and round the entries of the factors to be in {0, 1}. ...Our goal is to run various algorithms for k-BMF on each matrix and measure the Frobenius norm and the time taken to compute the factorization, for various values of k. |