reproducibilityindex.ai

Bayesian Probabilistic Numerical Integration with Tree-Based Models

Authors: Harrison Zhu, Xing Liu, Ruya Kang, Zhichao Shen, Seth Flaxman, François-Xavier Briol

NeurIPS 2020

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	The advantages and disadvantages of this new methodology are highlighted on a set of benchmark tests including the Genz functions, and on a Bayesian survey design problem.
Researcher Affiliation	Academia	Harrison Zhu, Xing Liu (Imperial College London, {hbz15,xl6116}@ic.ac.uk); Ruya Kang (Brown University, ruya_kang@brown.edu); Zhichao Shen (University of Oxford, zhichao.shen@new.ox.ac.uk); Seth Flaxman (Imperial College London, s.flaxman@imperial.ac.uk); François-Xavier Briol (University College London, f.briol@ucl.ac.uk)
Pseudocode	Yes	Algorithm 1 Sequential Design for BART-Int
Open Source Code	No	The paper states that an external tool `dbarts` was used ("For BART-Int, we used the default prior settings in dbarts [20]"), but it does not provide a link or explicit statement about the release of its own source code for the methodology described.
Open Datasets	Yes	We use individual-level anonymised census data from the United States [79] ... [79] U.S. Census Bureau. American Community Survey, 2012-2016 ACS 5-Year PUMS Files. Technical report, U.S. Department of Commerce, January 2018.
Dataset Splits	No	The paper describes how data points were selected for sequential design and numerical integration (e.g., "n_ini = 20d design points", "n_seq = 20d additional points"), and how ground truth was computed for evaluation, but it does not specify traditional train/validation/test dataset splits with percentages or counts for model training or hyperparameter tuning.
Hardware Specification	No	The paper discusses computational complexity and run-times (Figure 2) but does not provide specific details about the hardware (e.g., CPU, GPU models, memory) used for running the experiments.
Software Dependencies	No	The paper mentions using "dbarts [20]" for BART-Int but does not specify a version number for this or any other software dependency.
Experiment Setup	Yes	For BART-Int, we used the default prior settings in dbarts [20], whereas for GP-BQ we used a Matérn kernel whose lengthscale was chosen through maximum likelihood. ... The MAPE is given by $\frac{1}{r} \sum_{t=1}^{r} |\Pi[f] - \hat{\Pi}_t[f]| / |\Pi[f]|$, where $\hat{\Pi}_t[f]$ for $t = 1, \ldots, r$ are estimates of $\Pi[f]$ for $r$ different initial i.i.d. uniform point sets. ... BART-Int (m = 1500, T = 200 m = 1000, T = 50, with a burn-in of 1000 and keeping every 5 samples afterwards) ... The number of post-burn-in samples is chosen to be 10^4. We set γ = 2, d_i = 0.5i and c_i = 0.2i. ... We randomly select our initial set (of size n_ini = 20) and candidate set (of size S = 10,000).
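
To make the MAPE quoted in the Experiment Setup row concrete, here is a minimal Python sketch. The function name `mape`, the toy integrand, and the plain Monte Carlo estimator are assumptions chosen for illustration; none of this is the paper's code, which relies on the R package dbarts.

```python
import numpy as np


def mape(true_integral, estimates):
    """Mean absolute percentage error of r estimates against a known integral.

    Computes (1/r) * sum_t |true - estimate_t| / |true|.
    """
    estimates = np.asarray(estimates, dtype=float)
    if true_integral == 0:
        raise ValueError("MAPE is undefined when the true integral is zero")
    return float(np.mean(np.abs(true_integral - estimates)) / abs(true_integral))


# Toy setting (hypothetical, not from the paper): integrate f(x) = x^2 over
# [0, 1], whose true value is 1/3, using r independent uniform point sets.
rng = np.random.default_rng(0)
true_value = 1.0 / 3.0
r = 20           # number of repeated runs
n_points = 200   # points per run
estimates = [np.mean(rng.uniform(size=n_points) ** 2) for _ in range(r)]

print(f"MAPE over {r} runs: {mape(true_value, estimates):.4f}")
```

Any other integral estimator can be swapped in for the Monte Carlo average, as long as it returns one estimate per initial point set.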