Quasi-Bayes meets Vines

Authors: David Huk, Yuanhe Zhang, Ritabrata Dutta, Mark Steel

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments illustrate that the QB-Vine is appropriate for high-dimensional distributions (~64), needs very few samples to train (~200), and outperforms state-of-the-art methods with analytical forms for density estimation and supervised tasks by a considerable margin. Further details on the experiments are included in Appendix E.
Researcher Affiliation | Academia | David Huk, Department of Statistics, University of Warwick (David.Huk@warwick.ac.uk); Yuanhe Zhang, Department of Statistics, University of Warwick (Yuanhe.Zhang@warwick.ac.uk); Mark Steel, Department of Statistics, University of Warwick (m.steel@warwick.ac.uk); Ritabrata Dutta, Department of Statistics, University of Warwick (Ritabrata.Dutta@warwick.ac.uk)
Pseudocode | Yes | Algorithm 1: Joint, marginal, and copula density estimation with the Quasi-Bayesian Vine
Open Source Code | Yes | Further details on the experiments are included in Appendix E. Code is included at https://github.com/Huk-David/QB-Vine.
Open Datasets | Yes | We evaluate the QB-Vine on density estimation benchmark UCI datasets [4] with small sample sizes ranging from 89 to 506 and dimensionality varying from 12 to 30, adding results for the QB-Vine and PRticle Filter to the experiments of [35].
Dataset Splits | Yes | We assume the same variance parameter b for all the KDE pair-copula estimators in the simplified vine and select it using 10-fold cross-validation, in a data-dependent manner, by minimizing the energy score between observations and J = 100 copula samples. We report the log predictive score $\mathrm{LPS} = \frac{1}{n_{\mathrm{test}}} \sum_{k=1}^{n_{\mathrm{test}}} \ln p^{(n_{\mathrm{train}})}(x_k)$ on a held-out test dataset of size $n_{\mathrm{test}}$ comprising half the samples, with the other half used for training, averaging results over five runs with random partitions each time.
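The split-in-half evaluation protocol and the log predictive score can be sketched as below. This is a minimal illustration: the standard-normal predictive density stands in for the fitted QB-Vine predictive, and all names are illustrative, not the authors' code.

```python
import math
import random

def log_predictive_score(log_density, test_points):
    """LPS = (1/n_test) * sum_k ln p(x_k), averaged over the held-out set."""
    return sum(log_density(x) for x in test_points) / len(test_points)

# Split the data in half: one half for training, the other held out,
# as in the paper's protocol (here with synthetic 1-D data).
random.seed(0)
data = [random.gauss(0.0, 1.0) for _ in range(200)]
random.shuffle(data)
half = len(data) // 2
train, test = data[:half], data[half:]

# Stand-in predictive density "fitted" on the training half.
def std_normal_logpdf(x):
    return -0.5 * x * x - 0.5 * math.log(2.0 * math.pi)

lps = log_predictive_score(std_normal_logpdf, test)
```

In the paper this score is additionally averaged over five random train/test partitions; repeating the split-and-score loop five times and averaging reproduces that step.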
Hardware Specification | Yes | We ran all experiments on an Intel(R) Core(TM) i7-9700 processor.
Software Dependencies | Yes | In experiments, we use the implementation of vine copulas from [88] through a Python interface.
Experiment Setup | Yes | In experiments, we use a grid search over 50 values from 0.1 to 0.99 to select ρ, independently across dimensions, possibly selecting different values for each. To select the KDE pair-copula bandwidth, we use 10-fold cross-validation to evaluate the energy score for 50 values between 2 and 4, as these ranges were appraised to give the best fits in preliminary runs on training data. For energy score evaluations with marginal predictives, we sample 100 observations and compare them to the training data, while for the copula we simulate 100 samples from the joint and compare them, via the energy score, against the training data. Our default choice is a standard Cauchy distribution.
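The empirical energy score used as the selection criterion can be sketched as follows. Here `sample_model` and the three-value candidate grid are hypothetical placeholders for the QB-Vine copula sampler and the actual 50-value bandwidth grid; only the scoring formula itself is standard.

```python
import math
import random

def euclid(a, b):
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def energy_score(samples, observations):
    """Empirical energy score between J model samples and n observations:
    ES = (1/(J*n)) * sum_{j,i} ||X_j - y_i|| - (1/(2*J^2)) * sum_{j,k} ||X_j - X_k||.
    Lower is better."""
    J, n = len(samples), len(observations)
    cross = sum(euclid(x, y) for x in samples for y in observations) / (J * n)
    within = sum(euclid(x, xp) for x in samples for xp in samples) / (2.0 * J * J)
    return cross - within

# Hypothetical bandwidth selection: for each candidate b, draw J = 100
# samples from a stand-in sampler and keep the b that scores best
# against the observations.
random.seed(1)
obs = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(50)]

def sample_model(b, J=100):
    # Placeholder sampler; the paper instead simulates from the copula.
    return [(random.gauss(0, b), random.gauss(0, b)) for _ in range(J)]

candidates = [0.5, 1.0, 2.0]
best_b = min(candidates, key=lambda b: energy_score(sample_model(b), obs))
```

The paper applies this criterion twice: once per margin (100 samples from each marginal predictive) and once for the joint (100 samples simulated from the copula), each time scored against the training data.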