reproducibilityindex.ai

Undercover Boolean Matrix Factorization with MaxSAT

Authors: Florent Avellaneda, Roger Villemaire3672-3681

AAAI 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	From a practical standpoint, experimental results indicate that our blockoptimal k-undercover algorithm outperforms the state-of-the-art even when compared with algorithms for the more general k-undercover Boolean Matrix Factorization problem for which only minimizing reconstruction error is required. All experiments have been performed on Ubuntu 20.04 with Intel Core i7-2600K CPU @ 3.40GHz and 12 GB of RAM. We have performed experiments on 25 datasets from UCI (Dua and Graff 2017).
Researcher Affiliation	Academia	1 Universit e du Qu ebec a Montr eal (UQAM), Montr eal, Canada 2 Centre de Recherche de l Institut Universitaire de G eriatrie de Montr eal, Montr eal, Canada
Pseudocode	Yes	Algorithm 1: Generate Cardinalities; Algorithm 2: Optimal Undercover; Algorithm 3: Opti Block; Algorithm 4: Fast Undercover.
Open Source Code	Yes	Our algorithms have been implemented in C++ and in this section, we evaluate our methods Fast Undercover (Algorithm 4), Opti Block (Algorithm 3) with Am k and Bk n initialized to {0}m k, {0}k n and Opti Block which corresponds to ﬁrst running Fast Undercover, then using the solution found as the initial values of Am k and Bk n for Opti Block. See https://github.com/Florent Avellaneda/Undercover BMF
Open Datasets	Yes	We have performed experiments on 25 datasets from UCI (Dua and Graff 2017), namely: Audiology, Autism Screening Adult, Balance Scale, Breast Cancer, Car Evaluation, Chess (King-Rook vs. King), Congressional Voting Records, Contraceptive Method Choice, Dermatology, Hepatitis, Iris, Lung Cancer, Lymphography, Mushroom, Nursery, Primary Tumor, Solar Flare, Soybean (Large), Statlog (Heart), Student Performance, Thoracic Surgery, Tic-Tac-Toe Endgame, Website Phishing, Wine and Zoo. The binarized datasets used are also available in the appendix.
Dataset Splits	No	The paper does not explicitly provide training/test/validation dataset splits or cross-validation details for the experiments performed on the UCI datasets. It only describes how k values were chosen for factorization.
Hardware Specification	Yes	All experiments have been performed on Ubuntu 20.04 with Intel Core i7-2600K CPU @ 3.40GHz and 12 GB of RAM.
Software Dependencies	Yes	All experiments have been performed on Ubuntu 20.04. We compare the time required to solve randomly generated problems with all Max SAT solvers competing in the Max SAT Evaluation 2021, i.e., Max HS (Bacchus 2021), CASHWMax SAT (Lei et al. 2021), Eval Max SAT (Avellaneda 2020), UWr Max SAT (Piotr ow 2021; Piotrow 2020), Open-WBO-RES (Martins et al. 2021; Martins, Manquinho, and Lynce 2014), Pacose (Paxian and Becker 2021) and Exact (Devriendt 2021).
Experiment Setup	Yes	We randomly generate two matrices A100 10 and B10 100 and assigned the value 1 to each entry with the probability of p 1 10 1 0.1. Thus, ensuring that the probability of having a 1 in a entry of A B is 10%. We then calculate X = (A B) and remove 10% of entries randomly. For each dataset, we compute the factorization over three difference rank value kmax 4 2 and kmax 4 3 where kmax is a trivial rank for the dataset to factorize (generally the minimum between the number of lines and the number of columns of the dataset). In our evaluation, we set the time budget to twice the time used by Opti Block.