Handling Noise in Boolean Matrix Factorization
Authors: Radim Belohlavek, Martin Trnecka
IJCAI 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present an experimental evaluation of several existing algorithms and compare the results to the observations available in the literature. ... Our new experiments below show that an algorithm may be robust to noise in that the factors computed do not change much when noise is added, yet it may not have a good capability of discovering ground truth, and vice versa. ... We now propose a new experimental scenario to assess robustness to noise of a given BMF algorithm. ... We used data commonly used in BMF experiments, both real and synthetic. |
| Researcher Affiliation | Academia | Radim Belohlavek and Martin Trnecka, Dept. of Computer Science, Palacký University Olomouc, Czech Republic |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any statement or link indicating that source code for the described methodology is publicly available. |
| Open Datasets | Yes | As a representative of real data, we present results for the 231 × 79 Domino dataset (see e.g. [Ene et al., 2008]). ... we briefly report results on the 1000 × 50 datasets from [Gupta et al., 2008] with noise added. |
| Dataset Splits | No | The paper describes generating synthetic datasets and adding noise for experiments, but it does not specify explicit train/validation/test splits with percentages or sample counts for model training or tuning. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, memory) used to run the experiments. |
| Software Dependencies | No | The paper does not provide specific software names with version numbers that would be necessary to replicate the experiments. |
| Experiment Setup | Yes | We only report results for synthetic data matrices M of size 500 × 250 obtained as Boolean products A ∘ B of 500 × k and k × 250 randomly generated matrices A and B for varying k with density of M (percentage of 1s) around 15%. ... 1000 factorizations for each noise level added to the Domino data were computed for each algorithm. ... Again, 1000 iterations for each level of noise have been performed. |
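
The synthetic setup quoted in the Experiment Setup row can be sketched roughly as follows. This is a minimal illustration and not the authors' code, which is not available. Several details are assumptions: the function names are hypothetical, the entries of A and B are drawn as independent Bernoulli variables with a common density chosen so that the expected density of M is about 15%, and noise is modeled as flipping a fixed fraction of entries chosen uniformly at random. The paper may use a different noise model or generation procedure.

```python
import numpy as np


def boolean_product(A, B):
    """Boolean matrix product: entry (i, j) is 1 iff A[i, l] and B[l, j] are both 1 for some l."""
    return (A.astype(np.int64) @ B.astype(np.int64)) > 0


def factor_density(target_density, k):
    """Density p for A and B so that the expected density of the product is target_density.

    Assumption: entries are independent Bernoulli(p), so
    P(M[i, j] = 1) = 1 - (1 - p**2) ** k.
    """
    return np.sqrt(1.0 - (1.0 - target_density) ** (1.0 / k))


def synthetic_matrix(n=500, m=250, k=10, target_density=0.15, rng=None):
    """Generate random n x k and k x m Boolean factors and their Boolean product."""
    rng = np.random.default_rng() if rng is None else rng
    p = factor_density(target_density, k)
    A = rng.random((n, k)) < p
    B = rng.random((k, m)) < p
    return A, B, boolean_product(A, B)


def flip_noise(M, fraction, rng=None):
    """Return a copy of M with a given fraction of entries flipped (assumed noise model)."""
    rng = np.random.default_rng() if rng is None else rng
    flat = M.ravel().copy()
    n_flips = int(round(fraction * flat.size))
    # Sample distinct positions so exactly n_flips entries change.
    idx = rng.choice(flat.size, size=n_flips, replace=False)
    flat[idx] = ~flat[idx]
    return flat.reshape(M.shape)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A, B, M = synthetic_matrix(k=10, rng=rng)
    noisy = flip_noise(M, 0.05, rng=rng)
    print("density of M:", M.mean())
    print("entries changed by noise:", int((noisy != M).sum()))
```

Repeating `flip_noise` with a fresh random state for each noise level, and running a BMF algorithm on each noisy matrix, would correspond loosely to the "1000 iterations for each level of noise" described in the table.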