Undercover Boolean Matrix Factorization with MaxSAT
Authors: Florent Avellaneda, Roger Villemaire3672-3681
AAAI 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | From a practical standpoint, experimental results indicate that our blockoptimal k-undercover algorithm outperforms the state-of-the-art even when compared with algorithms for the more general k-undercover Boolean Matrix Factorization problem for which only minimizing reconstruction error is required. All experiments have been performed on Ubuntu 20.04 with Intel Core i7-2600K CPU @ 3.40GHz and 12 GB of RAM. We have performed experiments on 25 datasets from UCI (Dua and Graff 2017). |
| Researcher Affiliation | Academia | 1 Universit e du Qu ebec a Montr eal (UQAM), Montr eal, Canada 2 Centre de Recherche de l Institut Universitaire de G eriatrie de Montr eal, Montr eal, Canada |
| Pseudocode | Yes | Algorithm 1: Generate Cardinalities; Algorithm 2: Optimal Undercover; Algorithm 3: Opti Block; Algorithm 4: Fast Undercover. |
| Open Source Code | Yes | Our algorithms have been implemented in C++ and in this section, we evaluate our methods Fast Undercover (Algorithm 4), Opti Block (Algorithm 3) with Am k and Bk n initialized to {0}m k, {0}k n and Opti Block which corresponds to first running Fast Undercover, then using the solution found as the initial values of Am k and Bk n for Opti Block. See https://github.com/Florent Avellaneda/Undercover BMF |
| Open Datasets | Yes | We have performed experiments on 25 datasets from UCI (Dua and Graff 2017), namely: Audiology, Autism Screening Adult, Balance Scale, Breast Cancer, Car Evaluation, Chess (King-Rook vs. King), Congressional Voting Records, Contraceptive Method Choice, Dermatology, Hepatitis, Iris, Lung Cancer, Lymphography, Mushroom, Nursery, Primary Tumor, Solar Flare, Soybean (Large), Statlog (Heart), Student Performance, Thoracic Surgery, Tic-Tac-Toe Endgame, Website Phishing, Wine and Zoo. The binarized datasets used are also available in the appendix. |
| Dataset Splits | No | The paper does not explicitly provide training/test/validation dataset splits or cross-validation details for the experiments performed on the UCI datasets. It only describes how k values were chosen for factorization. |
| Hardware Specification | Yes | All experiments have been performed on Ubuntu 20.04 with Intel Core i7-2600K CPU @ 3.40GHz and 12 GB of RAM. |
| Software Dependencies | Yes | All experiments have been performed on Ubuntu 20.04. We compare the time required to solve randomly generated problems with all Max SAT solvers competing in the Max SAT Evaluation 2021, i.e., Max HS (Bacchus 2021), CASHWMax SAT (Lei et al. 2021), Eval Max SAT (Avellaneda 2020), UWr Max SAT (Piotr ow 2021; Piotrow 2020), Open-WBO-RES (Martins et al. 2021; Martins, Manquinho, and Lynce 2014), Pacose (Paxian and Becker 2021) and Exact (Devriendt 2021). |
| Experiment Setup | Yes | We randomly generate two matrices A100 10 and B10 100 and assigned the value 1 to each entry with the probability of p 1 10 1 0.1. Thus, ensuring that the probability of having a 1 in a entry of A B is 10%. We then calculate X = (A B) and remove 10% of entries randomly. For each dataset, we compute the factorization over three difference rank value kmax 4 2 and kmax 4 3 where kmax is a trivial rank for the dataset to factorize (generally the minimum between the number of lines and the number of columns of the dataset). In our evaluation, we set the time budget to twice the time used by Opti Block. |