Verifying Robustness of Gradient Boosted Models
Authors: Gil Einziger, Maayan Goldstein, Yaniv Sa’ar, Itai Segall
AAAI 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We extensively evaluate VERIGB on publicly available datasets and demonstrate a capability for verifying large models. |
| Researcher Affiliation | Collaboration | Gil Einziger, Maayan Goldstein, Yaniv Sa’ar, Itai Segall Nokia Bell Labs gilein@bgu.ac.il, {maayan.goldstein, yaniv.saar, itai.segall}@nokia-bell-labs.com |
| Pseudocode | No | The paper describes the encoding of models and properties using mathematical and logical formulas but does not include any pseudocode or algorithm blocks. (An illustrative SMT-encoding sketch appears after this table.) |
| Open Source Code | No | The paper mentions implementing VERIGB in Python, but does not provide any link or explicit statement about the availability of its source code. |
| Open Datasets | Yes | The House Sales in King County (HSKC) dataset containing 22K observations of houses sold between May 2014 and May 2015 in King County, USA (housing 2018). The Modified National Institute of Standards and Technology (MNIST) dataset containing 70K images of handwritten digits (LeCun 1998). The German Traffic Sign Recognition Benchmark (GTSRB) dataset containing 50K colored images of traffic signs (Houben et al. 2013). (A generic MNIST loading sketch appears after this table.) |
| Dataset Splits | No | The paper describes the training and evaluation of models but does not explicitly mention or specify a distinct validation set split or its proportions. |
| Hardware Specification | Yes | We conducted the experiments on a VM with 36 cores, a CPU speed of 2.4 GHz, a total of 150 GB memory, and the Ubuntu 16.04 operating system. The VM is hosted by a designated server with two Intel Xeon E5-2680v2 processors (each processor is made of 28 cores at 2.4 GHz), 260 GB memory, and Red Hat Enterprise Linux Server 7.3 operating system. |
| Software Dependencies | Yes | VERIGB utilizes Z3 (De Moura and Bjørner 2008) as the underlying SMT solver. We used the sklearn (Buitinck et al. 2013) and numpy (Jones et al. 2001) packages to train models. |
| Experiment Setup | Yes | We trained regressors varying the learning rates in {0.1, 0.2, 0.3}, the number of trees between 50 and 500, and the tree depth in {3, 5, 8, 10}. We trained gradient boosted models for the MNIST and GTSRB datasets with a learning rate of 0.1. We varied the number of trees between 20 and 100, and the maximal tree depth between 3 and 20. (An illustrative reconstruction of the regressor grid appears after this table.) |
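
Since the paper gives no pseudocode, the flavor of its SMT-based approach can be illustrated with a minimal sketch: a toy decision tree encoded as nested Z3 `If` terms, plus a robustness-style query asking whether the output can change inside a small box around an input point. This is not the paper's VeriGB encoding; the tree, the point, and the perturbation bound are all invented for illustration.

```python
# Minimal sketch (NOT the paper's VeriGB encoding): a toy decision tree
# expressed as Z3 constraints, with a robustness-style satisfiability query.
from z3 import Real, RealVal, Solver, If, And, sat

x1, x2 = Real('x1'), Real('x2')

# Toy tree: if x1 < 5 return 1; elif x2 < 3 return 2; else return 3.
tree_out = If(x1 < 5, RealVal(1),
              If(x2 < 3, RealVal(2), RealVal(3)))

# Query: within a box of radius eps around the point (4, 4), can the tree
# output anything other than 1 (its value at that point)?
eps = 1.5  # hypothetical perturbation bound
s = Solver()
s.add(And(x1 > 4 - eps, x1 < 4 + eps, x2 > 4 - eps, x2 < 4 + eps))
s.add(tree_out != 1)

if s.check() == sat:
    print("counterexample:", s.model())  # a perturbation that flips the output
else:
    print("robust within the box")
```

Here the query is satisfiable (e.g. x1 = 5, x2 = 3 yields output 3), so the solver returns a counterexample; an `unsat` result would certify robustness within the box.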
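Of the three public datasets, MNIST is the easiest to fetch programmatically. The sketch below uses sklearn's OpenML loader; this is a generic loading path, not the authors' published pipeline, and the `'mnist_784'` identifier is the standard OpenML mirror name rather than anything stated in the paper.

```python
# Generic MNIST loading sketch via OpenML; illustrative only, since the
# paper does not publish its data pipeline.
from sklearn.datasets import fetch_openml

# 'mnist_784' is the standard OpenML name for the 70K-image MNIST dataset.
X, y = fetch_openml('mnist_784', version=1, return_X_y=True, as_frame=False)
print(X.shape, y.shape)  # expected: (70000, 784) (70000,)
```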
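The reported regressor sweep maps naturally onto a hyperparameter grid. The following is a hedged reconstruction assuming sklearn's `GradientBoostingRegressor`: the synthetic data stands in for HSKC, and the intermediate tree counts (100, 200) are assumptions, since the paper states only the 50–500 range.

```python
# Hypothetical reconstruction of the regressor training grid reported in
# the paper; the synthetic data and intermediate tree counts are assumed.
from itertools import product
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor

# Stand-in data; the paper trains on the House Sales in King County dataset.
X_train, y_train = make_regression(n_samples=1000, n_features=10, random_state=0)

learning_rates = [0.1, 0.2, 0.3]   # as reported
n_trees = [50, 100, 200, 500]      # "between 50 and 500"; grid points assumed
depths = [3, 5, 8, 10]             # as reported

models = {}
for lr, n, d in product(learning_rates, n_trees, depths):
    gbr = GradientBoostingRegressor(learning_rate=lr, n_estimators=n, max_depth=d)
    models[(lr, n, d)] = gbr.fit(X_train, y_train)
```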