reproducibilityindex.ai

BoXHED: Boosted eXact Hazard Estimator with Dynamic covariates

Authors: Xiaochen Wang, Arash Pakbin, Bobak Mortazavi, Hongyu Zhao, Donald Lee

ICML 2020

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We evaluate the performance of BoXHED on simulation experiments, and also use it to analyze a cardiovascular disease dataset from the Framingham Heart Study. ... Table 1 presents the L2-errors for the hazard estimators when applied to the simulated datasets. ... Figure 2a presents the AUCt results for the estimators when applied to data simulated from λ1 (no irrelevant covariates).
Researcher Affiliation	Academia	(1) Biostatistics Department, Yale University, New Haven, Connecticut, USA; (2) Computer Science & Engineering, Texas A&M University, College Station, Texas, USA; (3) Goizueta Business School and Department of Biostatistics & Bioinformatics, Emory University, Atlanta, Georgia, USA.
Pseudocode	Yes	Algorithm 1 describes the BoXHED algorithm for estimating λ(t, x).
Open Source Code	Yes	BoXHED is available from GitHub: www.github.com/BoXHED.
Open Datasets	Yes	We pool together longitudinal records from two prospective cohorts: The Framingham Heart Study original cohort (FHS) and the Framingham Heart Study Offspring Cohort (FHS-OS) (Dawber et al., 1951).
Dataset Splits	Yes	The number of boosting iterations M as well as the maximum number of splits L in each tree are hyperparameters that are chosen via K-fold cross-validation. ... The 9,697 study participants are randomly split into 7,000/2,697 for training/testing.
Hardware Specification	No	The paper does not explicitly describe the specific hardware used (e.g., GPU models, CPU models, or cloud computing instances with specifications) for running its experiments.
Software Dependencies	No	The current version (1.0) is written in Python and uses regression trees as learners. The paper mentions Python but does not provide specific version numbers for Python or any other software dependencies or libraries used for the experiments.
Experiment Setup	Yes	Here, M is the number of boosting iterations, and the default learning rate ν = 0.1 is commonly used in boosting applications. The number of boosting iterations M as well as the maximum number of splits L in each tree are hyperparameters that are chosen via K-fold cross-validation. ... The estimated hazard surfaces are scaled to [0, 1] and are aggregated into four clusters using K-means clustering... SBP is bucketed into quintiles (<115, 115-124, 125-139, 140-149, and ≥150 mm Hg), and DBP is bucketed in the same way (<70, 70-79, 80-84, 85-89, ≥90 mm Hg).
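
The participant-level split and the blood pressure bucketing quoted in the table can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names, the seed, and the use of integer participant ids are hypothetical, and the top bucket of each scale is assumed to be open-ended (150 mm Hg and above for SBP, 90 mm Hg and above for DBP).

```python
import bisect
import random

# Bucket boundaries in mm Hg, taken from the Experiment Setup row.
# Assumption: the last bucket is open-ended (>= 150 for SBP, >= 90 for DBP).
SBP_EDGES = [115, 125, 140, 150]
DBP_EDGES = [70, 80, 85, 90]
SBP_LABELS = ["<115", "115-124", "125-139", "140-149", ">=150"]
DBP_LABELS = ["<70", "70-79", "80-84", "85-89", ">=90"]


def bucket(value, edges, labels):
    """Return the label of the bucket that contains `value`."""
    # bisect_right counts how many edges are <= value, which is the bucket index.
    return labels[bisect.bisect_right(edges, value)]


def split_participants(ids, n_train, seed=0):
    """Randomly split participant ids into (train, test) lists.

    Splitting by participant rather than by record keeps all longitudinal
    records of one person on the same side of the split.
    """
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)  # hypothetical seed, for repeatability
    return shuffled[:n_train], shuffled[n_train:]


# 9,697 participants split 7,000 / 2,697, as reported in the Dataset Splits row.
train_ids, test_ids = split_participants(range(9697), n_train=7000)
```

For example, `bucket(132, SBP_EDGES, SBP_LABELS)` falls in the "125-139" bucket, and `bucket(90, DBP_EDGES, DBP_LABELS)` falls in the assumed open-ended top bucket.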