Learning the Finer Things: Bayesian Structure Learning at the Instantiation Level
Authors: Chase Yakaboski, Eugene Santos, Jr.
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | By leveraging Bayesian Knowledge Bases (BKBs), a framework that operates at the instantiation level and inherently subsumes Bayesian Networks (BNs), we develop both a theoretical MDL score and associated structure learning algorithm that demonstrates significant improvements over learned BNs on 40 benchmark datasets. Further, our algorithm incorporates recent off-the-shelf DAG learning techniques enabling tractable results even on large problems. We then demonstrate the utility of our approach in a significantly under-determined domain by learning gene regulatory networks on breast cancer gene mutational data available from The Cancer Genome Atlas (TCGA). |
| Researcher Affiliation | Academia | Thayer School of Engineering at Dartmouth College, Hanover, NH {chase.th, esj}@dartmouth.edu |
| Pseudocode | Yes | Algorithm 1: BKB Structure Learning. Input: Dataset D, Source Reliabilities R, DAG learning algorithm f, and hyperparameters Θ. 1: K ← ∅; 2: for τ ∈ D do; 3: G_τ ← f(τ, R, Θ); 4: K ← K ∪ {G_τ}; 5: end for; 6: return BKB-Fusion(K, R). (See the Python sketch after this table.) |
| Open Source Code | Yes | For source code visit: https://github.com/di2ag/pybkb. |
| Open Datasets | Yes | We then demonstrate the utility of our approach in a significantly under-determined domain by learning gene regulatory networks on breast cancer gene mutational data available from The Cancer Genome Atlas (TCGA). (Tomczak, Czerwińska, and Wiznerowicz 2015) |
| Dataset Splits | Yes | We performed a 10-fold classification cross-validation on a subset of only 22 datasets due to the increased learning and reasoning time incurred by running the cross-validation analysis. (See the cross-validation sketch after this table.) |
| Hardware Specification | No | The paper provides no hardware details for its experiments, such as GPU/CPU models, memory, or cloud instance specifications. |
| Software Dependencies | No | The paper mentions using 'GOBNILP' but gives no version numbers for it or for any other software dependency, which limits reproducibility. |
| Experiment Setup | No | The paper mentions 'hyperparameters Θ' in Algorithm 1 and refers to Appendix A for the naming conventions and feature selection process, but the main text does not detail specific hyperparameter values (e.g., learning rate, batch size) or other system-level training settings. |
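
The reconstructed pseudocode above is a simple per-instance learning loop. Below is a minimal Python sketch of Algorithm 1, assuming hypothetical callables `learn_dag` (standing in for the off-the-shelf DAG learner f) and `fuse` (standing in for the BKB-Fusion step); this is an illustration of the control flow, not the pybkb implementation.

```python
# A minimal sketch of Algorithm 1 under stated assumptions:
# `learn_dag` and `fuse` are hypothetical callables, not the pybkb API.

def bkb_structure_learning(dataset, reliabilities, learn_dag, theta, fuse):
    """Learn one DAG fragment per data instantiation, then fuse them into a BKB."""
    fragments = []                                    # 1: K <- empty set
    for tau in dataset:                               # 2: for tau in D do
        g_tau = learn_dag(tau, reliabilities, theta)  # 3: G_tau <- f(tau, R, Theta)
        fragments.append(g_tau)                       # 4: K <- K union {G_tau}
    return fuse(fragments, reliabilities)             # 6: return BKB-Fusion(K, R)
```

The design point worth noting is that a separate DAG fragment is learned for each data instantiation τ, and the fusion step combines all fragments into a single BKB weighted by the source reliabilities R.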
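Similarly, the 10-fold cross-validation protocol quoted in the Dataset Splits row can be sketched as follows, assuming scikit-learn's `KFold` for the splits; `fit` and `score` are hypothetical placeholders for the authors' learning and evaluation routines, so this illustrates the protocol rather than their code.

```python
# A minimal sketch of 10-fold cross-validation, assuming scikit-learn
# for the splits; `fit` and `score` are hypothetical placeholders.
import numpy as np
from sklearn.model_selection import KFold

def ten_fold_cv(X, y, fit, score, seed=0):
    folds = KFold(n_splits=10, shuffle=True, random_state=seed)
    accuracies = []
    for train_idx, test_idx in folds.split(X):
        model = fit(X[train_idx], y[train_idx])                     # train on 9 folds
        accuracies.append(score(model, X[test_idx], y[test_idx]))  # evaluate on the held-out fold
    return float(np.mean(accuracies))
```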