Bayesian Boolean Matrix Factorisation

Authors: Tammo Rukat, Chris C. Holmes, Michalis K. Titsias, Christopher Yau

ICML 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | On real world and simulated data, our method outperforms all currently existing approaches for Boolean matrix factorisation and completion.
Researcher Affiliation | Academia | 1: Department of Statistics, University of Oxford, UK; 2: Nuffield Department of Medicine, University of Oxford, UK; 3: Department of Informatics, Athens University of Economics and Business, Greece; 4: Centre for Computational Biology, Institute of Cancer and Genomic Sciences, University of Birmingham, UK.
Pseudocode | Yes | Algorithm 1: Computation of the full conditional of z_nl; Algorithm 2: Sampling from the OrMachine. (An illustrative sketch of these steps appears after this table.)
Open Source Code | Yes | Code is available on GitHub: https://github.com/TammoR/OrMachine/
Open Datasets | Yes | The MovieLens-1M dataset contains 10^6 integer film ratings from 1 to 5 from 6000 users for 4000 films... The MovieLens dataset is available online: https://grouplens.org/datasets/movielens/. ...the data is publicly available: https://support.10xgenomics.com
Dataset Splits | No | The paper describes using a 'random subset of the data' as observed data and reconstructing the 'missing data' for matrix completion, which serves as a test set, but it does not define a separate validation split for hyperparameter tuning or model selection. (A toy version of this completion protocol is sketched after this table.)
Hardware Specification | No | The paper mentions 'a desktop computer', '24 computing cores', 'a 4-core desktop computer', and 'a cluster with 24 cores', but does not provide specific details such as CPU/GPU models, memory, or cloud instance types.
Software Dependencies | No | The algorithm is implemented in Python with the core sampling routines in compiled Cython; no version numbers are given for Python, Cython, or any other libraries.
Experiment Setup | Yes | For the OrM, we initialise the parameters uniformly at random and draw 100 samples after 100 iterations of burn-in. ... we choose independent Bernoulli sparsity priors for the codes: p(u_ld) = [0.01, 0.05, 0.2] for each layer, respectively. ... The given values are means from 10 randomly initialised runs of each algorithm. ... We apply the OrMachine for latent dimensions L = 2, ..., 10. ... We draw 125 samples and discard the first 25 as burn-in.
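
The paper's two algorithms (the full conditional of z_nl and sampling from the OrMachine) are not reproduced here, but the following is a minimal, illustrative sketch of the underlying idea: a noisy Boolean (OR-AND) product X ≈ Z ∘ U and a Gibbs-style resampling of a single code entry z_nl. The symmetric noise parameterisation `lam`, the uniform Bernoulli prior `prior_p`, the optional observation mask `obs`, and all function names are assumptions made for illustration; this is not the authors' OrMachine implementation.

```python
# Minimal, illustrative sketch of a noisy Boolean (OR-AND) factorisation and a
# Gibbs-style update for a single code entry z_nl. This is NOT the authors'
# OrMachine code: the symmetric noise model with parameter `lam` and the
# uniform Bernoulli prior `prior_p` are assumptions made for illustration.
import numpy as np


def boolean_product(Z, U):
    """Deterministic Boolean product: f[n, d] = OR_l (Z[n, l] AND U[l, d])."""
    return (Z @ U > 0).astype(np.int8)


def log_sigmoid(x):
    """Numerically stable log of the logistic sigmoid."""
    return -np.logaddexp(0.0, -x)


def gibbs_update_znl(X, Z, U, n, l, lam, prior_p=0.5, obs=None, rng=None):
    """Resample Z[n, l] in place from its full conditional under the noise
    model p(x_nd = f_nd) = sigmoid(lam), p(x_nd != f_nd) = 1 - sigmoid(lam).
    Only columns where flipping Z[n, l] changes the Boolean product (and, if
    `obs` is given, only observed columns) contribute to the log-odds."""
    rng = np.random.default_rng() if rng is None else rng
    # Does any *other* latent dimension already explain column d?
    others = np.delete(Z[n] * U.T, l, axis=1).any(axis=1)       # shape (D,)
    # Columns where z_nl alone decides the output: u_ld = 1, no other cause.
    decisive = (U[l] == 1) & (~others)
    if obs is not None:
        decisive = decisive & obs[n].astype(bool)   # unobserved entries carry no likelihood
    x = X[n, decisive]
    # If z_nl = 1 the product equals 1 at decisive columns; if 0 it equals 0.
    ll_one = np.sum(np.where(x == 1, log_sigmoid(lam), log_sigmoid(-lam)))
    ll_zero = np.sum(np.where(x == 0, log_sigmoid(lam), log_sigmoid(-lam)))
    log_odds = ll_one - ll_zero + np.log(prior_p) - np.log1p(-prior_p)
    Z[n, l] = int(rng.random() < 1.0 / (1.0 + np.exp(-log_odds)))
    return Z[n, l]
```

Only the columns where flipping z_nl actually changes the Boolean product contribute to the log-odds, which is what keeps the full conditional cheap to evaluate; all other terms cancel between the two hypotheses.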
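Building on the sketch above, the toy driver below mirrors the protocol described in the Dataset Splits and Experiment Setup rows: parameters are initialised uniformly at random, a random subset of entries is treated as observed, 100 burn-in sweeps are followed by 100 retained samples, and completion accuracy is measured on the held-out entries. The data sizes, the 20% hold-out fraction, and lam = 2.0 are arbitrary choices for this example, not values from the paper, and the driver itself is not the authors' pipeline.

```python
# Toy completion driver. Assumes boolean_product and gibbs_update_znl from the
# sketch above are in scope. Sizes, hold-out fraction and lam are arbitrary.
import numpy as np

rng = np.random.default_rng(0)
N, D, L, lam = 100, 40, 3, 2.0

# Synthetic Boolean data of known rank.
Z_true = (rng.random((N, L)) < 0.3).astype(np.int8)
U_true = (rng.random((L, D)) < 0.3).astype(np.int8)
X = boolean_product(Z_true, U_true)

observed = rng.random((N, D)) < 0.8       # hold out ~20% of entries
Z = rng.integers(0, 2, size=(N, L)).astype(np.int8)   # uniform random init
U = rng.integers(0, 2, size=(L, D)).astype(np.int8)

burn_in, n_samples = 100, 100             # counts quoted in the table
post_mean = np.zeros((N, D))
for it in range(burn_in + n_samples):
    for n in range(N):
        for l in range(L):
            gibbs_update_znl(X, Z, U, n, l, lam, obs=observed, rng=rng)
    for d in range(D):
        for l in range(L):
            # By OR-AND symmetry, u_ld is a z-type update on the transpose;
            # writing through the U.T view updates U in place.
            gibbs_update_znl(X.T, U.T, Z.T, d, l, lam, obs=observed.T, rng=rng)
    if it >= burn_in:
        post_mean += boolean_product(Z, U)
post_mean /= n_samples

held_out = ~observed
print("completion accuracy:",
      ((post_mean[held_out] > 0.5) == X[held_out]).mean())
```

The held-out entries never enter the likelihood (they are masked out in the conditional), so the final accuracy is a genuine completion score rather than a training-set reconstruction.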