BAMDT: Bayesian Additive Semi-Multivariate Decision Trees for Nonparametric Regression
Authors: Zhao Tang Luo, Huiyan Sang, Bani Mallick
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the superior performance of the proposed method using simulation data and a Sacramento housing price data set. |
| Researcher Affiliation | Academia | 1Department of Statistics, Texas A&M University, College Station, TX, USA. |
| Pseudocode | Yes | Algorithm 1 Connecting connected components in G |
| Open Source Code | Yes | An implementation of the proposed model is available at https://github.com/ztluostat/BAMDT. |
| Open Datasets | Yes | We apply BAMDT to analyze housing price data in Sacramento County, California, available in R package caret (Kuhn, 2021). ... Sacramento County GIS. City boundaries: Sacramento County, California, 2015. URL https://earthworks.stanford.edu/catalog/stanford-kq595nj1377. |
| Dataset Splits | Yes | We simulate features for a test data set of size ntest = 200. ... We first compare the prediction performance of the five models using 5-fold cross-validation. |
| Hardware Specification | No | The paper mentions computation time and implementation languages (R, C++) but does not provide specific hardware details such as CPU/GPU models or memory. |
| Software Dependencies | No | The paper lists R packages used (igraph, fdaPDE, BART, GpGp, mgcv) along with their primary citation years, but does not specify exact version numbers for these software components. |
| Experiment Setup | Yes | We use M = 50 weak learners in BAMDT. For each weak learner, we randomly sample t = 100 locations from the training data as reference knots. ... We use 100 equally spaced grid points as candidates of univariate split cutoffs for each unstructured feature. The probability of performing a multivariate split is set to be pm = 2/(2 + p). ... Following Chipman et al. (2010), we choose α = 0.95 and β = 2 in (6)... We choose a = 2 by default... We choose ν = 3 and calibrate the prior by selecting λs... we run the MCMC algorithms for 30,000 iterations, discarding the first half and retaining samples every 10 iterations. |
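
The Open Datasets and Software Dependencies rows above point to the Sacramento housing data shipped with the R package caret and to the other R packages cited in the paper. As a minimal sketch of obtaining these artifacts (package versions are not specified in the paper, so whatever CRAN currently provides is assumed):

```r
## Install the R packages cited in the paper (versions unspecified in the paper)
## and load the Sacramento housing data referenced under Open Datasets.
install.packages(c("caret", "igraph", "fdaPDE", "BART", "GpGp", "mgcv"))

library(caret)
data(Sacramento)   # home sales in Sacramento County, CA, bundled with caret
str(Sacramento)    # includes price, latitude, longitude, and structural attributes
```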
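The Dataset Splits row reports 5-fold cross-validation on the housing data (and a simulated test set of size 200). The paper does not give the random seed or fold assignment, so the following is only an illustration of the protocol, not the authors' exact partition:

```r
## Illustrative 5-fold cross-validation split of the Sacramento data.
library(caret)
data(Sacramento)

set.seed(1)                                    # assumed seed, not from the paper
folds <- createFolds(Sacramento$price, k = 5)  # list of 5 held-out index sets

for (i in seq_along(folds)) {
  test_idx <- folds[[i]]
  train <- Sacramento[-test_idx, ]
  test  <- Sacramento[test_idx, ]
  # fit BAMDT (or a competing model) on `train`, predict on `test`,
  # and record out-of-sample error for this fold
}
```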
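The hyperparameters quoted under Experiment Setup can be collected into a single configuration. The names below are illustrative placeholders only; the actual argument names and interface in the authors' repository (https://github.com/ztluostat/BAMDT) may differ, and p is shown with an example value:

```r
## Hypothetical configuration gathering the settings reported in the paper.
p <- 2                                # number of unstructured features (example value)
bamdt_settings <- list(
  M        = 50,                      # number of weak learners
  n_knots  = 100,                     # reference knots sampled per weak learner
  n_cutoff = 100,                     # equally spaced candidate cutoffs per feature
  p_multi  = 2 / (2 + p),             # probability of a multivariate split
  alpha    = 0.95, beta = 2,          # tree prior parameters (Chipman et al., 2010)
  a        = 2,                       # default value of a
  nu       = 3,                       # with lambda calibrated as described in the paper
  n_iter   = 30000,                   # MCMC iterations
  burn_in  = 15000,                   # first half discarded
  thin     = 10                       # retain every 10th sample
)
```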