Automatic Bayesian Density Analysis

Authors: Antonio Vergari, Alejandro Molina, Robert Peharz, Zoubin Ghahramani, Kristian Kersting, Isabel Valera

AAAI 2019, pp. 5207-5215

Reproducibility assessment. Each entry below gives a reproducibility variable, its result, and the supporting LLM response.

Research Type: Experimental
LLM Response: In our extensive experimental evaluation, we demonstrate that ABDA effectively assists domain experts in both transductive and inductive settings. We empirically evaluate ABDA on synthetic and real-world datasets, both as a density estimator and as a tool to perform several exploratory data analysis tasks.

Researcher Affiliation: Collaboration
LLM Response: Antonio Vergari (antonio.vergari@tue.mpg.de), MPI-IS, Tuebingen, Germany; Alejandro Molina (molina@cs.tu-darmstadt.de), TU Darmstadt, Germany; Robert Peharz (rp587@cam.ac.uk), University of Cambridge, UK; Zoubin Ghahramani (zoubin@cam.ac.uk), University of Cambridge, UK and Uber AI Labs, USA; Kristian Kersting (kersting@cs.tu-darmstadt.de), TU Darmstadt, Germany; Isabel Valera (isabel.valera@tue.mpg.de), MPI-IS, Tuebingen, Germany

Pseudocode: Yes
LLM Response: Algorithm 1, "Gibbs sampling inference in ABDA"

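For orientation, the following is a minimal sketch of the Gibbs-sampling pattern that Algorithm 1 follows (alternate conditional sampling of latent assignments, mixture weights, and leaf parameters, with a burn-in phase). It is a toy univariate Gaussian mixture, not ABDA itself; all names here are illustrative, and only the Dirichlet concentration gamma=10.0 and the 5000/4000 iteration counts mirror values reported in the experiment setup.

```python
import numpy as np

rng = np.random.default_rng(0)

def gibbs_gmm(x, K=2, iters=5000, burn_in=4000, gamma=10.0):
    """Toy Gibbs sampler for a K-component, unit-variance Gaussian mixture.
    Illustrative of the general pattern only; not the paper's Algorithm 1."""
    mu = rng.normal(size=K)          # component means, standard-normal prior
    w = np.full(K, 1.0 / K)          # mixture weights
    samples = []
    for t in range(iters):
        # 1) sample assignments z_i | w, mu from the categorical posterior
        logp = np.log(w) - 0.5 * (x[:, None] - mu[None, :]) ** 2
        p = np.exp(logp - logp.max(axis=1, keepdims=True))
        p /= p.sum(axis=1, keepdims=True)
        z = np.array([rng.choice(K, p=pi) for pi in p])
        # 2) sample weights w | z under a symmetric Dirichlet(gamma) prior
        counts = np.bincount(z, minlength=K)
        w = rng.dirichlet(gamma + counts)
        # 3) sample means mu_k | z, x (conjugate N(0,1) prior, unit noise)
        for k in range(K):
            xk = x[z == k]
            var = 1.0 / (1.0 + len(xk))
            mu[k] = rng.normal(var * xk.sum(), np.sqrt(var))
        if t >= burn_in:             # keep only post-burn-in draws
            samples.append((w.copy(), mu.copy()))
    return samples

# Usage: samples = gibbs_gmm(rng.normal(size=200))
```
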
Open Source Code: Yes
LLM Response: Supplementary material and a reference implementation of ABDA are available at github.com/probabilistic-learning/abda.

Open Datasets: Yes
LLM Response: From the ISLV and MSPN original works, we select 12 real-world datasets differing w.r.t. size and feature heterogeneity. Appendix C reports detailed dataset information... For example, the Wine Quality and Abalone datasets are commonly used public benchmarks.

Dataset Splits: Yes
LLM Response: For the transductive setting, we randomly remove either 10% or 50% of the data entries, reserving an additional 2% as a validation set for hyperparameter tuning (when required), and repeating this process five times for robust evaluation. For the inductive scenario, we split the data into train, validation, and test sets (70%, 10%, and 20%, respectively).

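The quoted protocol is concrete enough to sketch. Below is a minimal, hypothetical implementation assuming entry-wise masking for the transductive setting and a row-wise 70/10/20 split for the inductive one; transductive_masks and inductive_split are illustrative names, not functions from the ABDA codebase.

```python
import numpy as np

rng = np.random.default_rng(0)

def transductive_masks(X, missing_frac=0.1, val_frac=0.02):
    """Hide `missing_frac` of the entries as held-out targets and a
    further `val_frac` as a validation set (entry-wise, not row-wise)."""
    flat = rng.permutation(X.size)
    n_miss = int(missing_frac * X.size)
    n_val = int(val_frac * X.size)
    test_mask = np.zeros(X.size, dtype=bool)
    val_mask = np.zeros(X.size, dtype=bool)
    test_mask[flat[:n_miss]] = True
    val_mask[flat[n_miss:n_miss + n_val]] = True
    return test_mask.reshape(X.shape), val_mask.reshape(X.shape)

def inductive_split(X, train=0.7, val=0.1):
    """Row-wise train/validation/test split (70%/10%/20% by default)."""
    idx = rng.permutation(len(X))
    n_tr, n_va = int(train * len(X)), int(val * len(X))
    return X[idx[:n_tr]], X[idx[n_tr:n_tr + n_va]], X[idx[n_tr + n_va:]]
```

Repeating `transductive_masks` with five different seeds reproduces the five-fold robustness protocol described above.
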
Hardware Specification: No
LLM Response: The paper does not provide any hardware details (e.g., CPU or GPU models, or memory) for running its experiments.

Software Dependencies: No
LLM Response: The paper states: "We implemented ABDA by leveraging the SPFlow library". While a library is mentioned, no specific version number for SPFlow or any other software dependency is provided.

Experiment Setup: Yes
LLM Response: In all experiments, we use a symmetric Dirichlet prior with γ = 10 for the sum weights Ω and a sparse symmetric prior with α = 0.1 for the leaf likelihood weights w_j^d. For ABDA and ISLV, we run 5000 iterations of Gibbs sampling, discarding the first 4000 as burn-in. We learn MSPNs with the same hyperparameters as for ABDA structure learning, i.e., stopping network growth when the data to be split is less than 10% of the dataset, while employing a grid search over {0.3, 0.5, 0.7} for the RDC dependency-test threshold.
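
To make the reported hyperparameters concrete, here is a hypothetical configuration plus the RDC grid search described above. The excerpt does not show ABDA's training API, so `fit_abda`, `score`, and the config keys are placeholders; only the numeric values come from the paper.

```python
# Hyperparameters as reported in the experiment setup (values from the paper).
CONFIG = {
    "gamma": 10.0,               # symmetric Dirichlet prior on sum weights Ω
    "alpha": 0.1,                # sparse symmetric prior on leaf weights w_j^d
    "gibbs_iters": 5000,         # total Gibbs sampling iterations
    "burn_in": 4000,             # initial samples discarded
    "min_instances_frac": 0.10,  # stop splitting below 10% of the dataset
}
RDC_GRID = [0.3, 0.5, 0.7]       # RDC dependency-test thresholds to search

def grid_search(train, valid, fit_abda, score):
    """Select the RDC threshold with the best validation score.
    `fit_abda` and `score` are illustrative placeholders, not ABDA's API."""
    best = None
    for rdc in RDC_GRID:
        model = fit_abda(train, rdc_threshold=rdc, **CONFIG)
        s = score(model, valid)
        if best is None or s > best[0]:
            best = (s, rdc, model)
    return best
```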