Fast and Efficient Boolean Matrix Factorization by Geometric Segmentation

Authors: Changlin Wan, Wennan Chang, Tong Zhao, Mengya Li, Sha Cao, Chi Zhang
Pages: 6086-6093

Venue: AAAI 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We compared MEBF with other state of the art approaches on data scenarios with different density and noise levels. MEBF demonstrated superior performances in lower reconstruction error, and higher computational efficiency, as well as more accurate density patterns than popular methods such as ASSO, PANDA and Message Passing.
Researcher Affiliation | Collaboration | ¹Purdue University, ²Indiana University, ³Amazon
Pseudocode | Yes | Algorithm 1: MEBF
    Inputs: X ∈ {0, 1}^{n×m}, t ∈ (0, 1), τ
    Outputs: A ∈ {0, 1}^{n×k}, B ∈ {0, 1}^{k×m}
    MEBF(X, t, τ):
      X_residual ← X, γ0 ← inf
      A ← NULL, B ← NULL
      while !τ do
        (a, b) ← bidirectional_growth(X_residual, t)
        A_tmp ← append(A, a)
        B_tmp ← append(B, b)
        if γ(A_tmp, B_tmp; X) > γ0 then
          (a, b) ← weak_signal_detection(X_residual, t)
          A_tmp ← append(A, a)
          B_tmp ← append(B, b)
          if γ(A_tmp, B_tmp; X) > γ0 then
            break
        A ← append(A, a)
        B ← append(B, b)
        γ0 ← γ(A, B; X)
        X_residual_ij ← 0 when (a ⊗ b)_ij = 1
      end
Open Source Code | Yes | The code is available at https://github.com/clwan/MEBF
Open Datasets | Yes | Chicago Crime records² (X ∈ {0, 1}^{6787×752}) and head and neck cancer single cell RNA sequencing data³ (X ∈ {0, 1}^{344×5902}). ² Chicago crime records downloaded on August 20, 2019 from https://data.cityofchicago.org/Public-Safety ³ This head and neck sequencing data can be accessed at https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE103322
Dataset Splits | No | The paper discusses simulated and real-world datasets, but it does not specify concrete training, validation, or test dataset splits (e.g., percentages, sample counts, or citations to predefined splits).
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, processor types, or memory amounts) used for running its experiments.
Software Dependencies | No | The paper mentions comparing its method with other algorithms like ASSO, PANDA, and Message Passing, and states that its code is available on GitHub. However, it does not provide specific software dependencies with version numbers (e.g., Python 3.8, PyTorch 1.9).
Experiment Setup | Yes | For the crime data, parameters (t = 0.7, k = 20) were used. For the single cell data, parameters (t = 0.6, k = 5) were used. The convergence criteria for the algorithms are set as: (1) 10 patterns were identified; or (2) for MEBF, PANDA and ASSO, a newly identified pattern does not decrease the cost function.
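
The control flow of Algorithm 1 in the Pseudocode row can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation (see the linked repository for that). The cost γ is assumed here to be the number of entries where the Boolean reconstruction differs from X, and the stopping condition τ is approximated by a `max_patterns` cap plus a check for an empty residual. The summary does not describe `bidirectional_growth` or `weak_signal_detection`, so a hypothetical stand-in, `densest_column_pattern`, is used for both.

```python
import numpy as np


def boolean_product(A, B):
    """Boolean matrix product: entry (i, j) is 1 if some l has A[i, l] = B[l, j] = 1."""
    return (A.astype(int) @ B.astype(int) > 0).astype(np.uint8)


def cost(A, B, X):
    """ASSUMED cost gamma: number of entries where the reconstruction differs from X."""
    return int(np.sum(boolean_product(A, B) != X))


def densest_column_pattern(X_res, t):
    """HYPOTHETICAL stand-in for the paper's pattern finders (not the real procedure).

    Takes the densest residual column as a, then sets b[j] = 1 for every column j
    that contains ones in at least a fraction t of the rows selected by a.
    """
    j = int(np.argmax(X_res.sum(axis=0)))
    a = X_res[:, j].copy()
    if a.sum() == 0:  # empty residual: return an empty pattern
        return a, np.zeros(X_res.shape[1], dtype=np.uint8)
    overlap = X_res[a == 1].sum(axis=0) / a.sum()
    return a, (overlap >= t).astype(np.uint8)


def mebf_loop_sketch(X, t, max_patterns,
                     find_pattern=densest_column_pattern,
                     fallback=densest_column_pattern):
    """Greedy loop mirroring the control flow of Algorithm 1."""
    n, m = X.shape
    X_res = X.copy()
    A = np.zeros((n, 0), dtype=np.uint8)
    B = np.zeros((0, m), dtype=np.uint8)
    best = float("inf")  # gamma_0
    # Assumed stand-in for tau: pattern cap reached or nothing left to explain.
    while A.shape[1] < max_patterns and X_res.any():
        a, b = find_pattern(X_res, t)
        A_tmp, B_tmp = np.column_stack([A, a]), np.vstack([B, b])
        if cost(A_tmp, B_tmp, X) > best:
            # The first candidate raised the cost, so try the fallback finder.
            a, b = fallback(X_res, t)
            A_tmp, B_tmp = np.column_stack([A, a]), np.vstack([B, b])
            if cost(A_tmp, B_tmp, X) > best:
                break
        A, B = A_tmp, B_tmp
        best = cost(A, B, X)
        X_res[np.outer(a, b) == 1] = 0  # remove covered entries from the residual
    return A, B
```

Passing the two pattern finders as arguments keeps the loop independent of how patterns are found, so the real procedures could be substituted without changing the acceptance and stopping logic. On a small matrix made of two disjoint all-ones blocks, this sketch returns one pattern per block and a reconstruction with zero mismatches.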