reproducibilityindex.ai

Being Properly Improper

Authors: Tyler Sypherd, Richard Nock, Lalitha Sankar

ICML 2022

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We study the twist-proper α-loss under a novel boosting algorithm, called PILBOOST, and provide formal and experimental results for this algorithm. Our overarching practical conclusion is that the twist-proper α-loss outperforms the proper log-loss on several variants of twisted data. In Section 6, we implement PILBOOST with the approximate inverse canonical link of α-loss on several tabular datasets, each suffering from various twists (label, feature, and adversarial noise), and compare against AdaBoost (Freund & Schapire, 1997) and XGBoost (Chen & Guestrin, 2016).
Researcher Affiliation	Collaboration	¹School of Electrical, Computer and Energy Engineering, Arizona State University; ²Google Research. Correspondence to: Tyler Sypherd <tsypherd@asu.edu>.
Pseudocode	Yes	Algorithm 1 PILBOOST
Open Source Code	Yes	The code for all of our experiments (including the implementation of PILBOOST) can be found at the following github repository link: https://github.com/SankarLab/Being-Properly-Improper
Open Datasets	Yes	We provide experimental results on PILBOOST (for α ∈ {1.1, 2, 4}) and compare with AdaBoost (Freund & Schapire, 1997) and XGBoost (Chen & Guestrin, 2016) on four binary classification datasets, namely, cancer (Wolberg et al., 1995), xd6 (Buntine & Niblett, 1992), diabetes (Smith et al., 1988), and online shoppers intention (Sakar et al., 2019).
Dataset Splits	No	The paper mentions 'train/test split' and 'cross-validation' but does not explicitly describe a separate 'validation' dataset split for hyperparameter tuning.
Hardware Specification	Yes	Most of the experiments were performed over the course of a month on a 2015 MacBook Pro with a 2.2 GHz Quad-Core Intel Core i7 processor and 16 GB of memory. The Adaptive α experiments were performed on a computing cluster and each required about 30 minutes of compute time.
Software Dependencies	No	The paper mentions using decision trees and XGBoost but does not provide specific version numbers for any software dependencies or libraries.
Experiment Setup	Yes	All algorithms across all experiments ran for 1000 iterations. For α = 1.1, 2, and 4, we set a_f = 7, 2, and 4, respectively. Hyperparameters of XGBoost were kept to default to maintain the fairest comparison between the three algorithms; for more of these experimental details, please refer to Appendix B.5. All experiments use regression decision trees (of varying depths 1-3) in order to align with XGBoost.
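
The settings quoted in the Experiment Setup row can be collected into a small configuration grid. The following is only a minimal sketch: the numeric values and dataset names come from the table above, while every identifier (`NUM_ITERATIONS`, `A_F_BY_ALPHA`, `pilboost_runs`, the dictionary keys) is hypothetical and not taken from the authors' repository. It also assumes that every dataset is paired with every tree depth and every α, which the table does not confirm.

```python
from itertools import product

# Values quoted in the table; all names here are hypothetical.
NUM_ITERATIONS = 1000                    # "ran for 1000 iterations"
A_F_BY_ALPHA = {1.1: 7, 2: 2, 4: 4}      # per-alpha parameter from the setup row
TREE_DEPTHS = (1, 2, 3)                  # "varying depths 1-3"
DATASETS = ("cancer", "xd6", "diabetes", "online_shoppers_intention")


def pilboost_runs():
    """Enumerate one run config per (dataset, depth, alpha).

    Assumption: a full cross product of the settings; the paper may
    have used only a subset of these combinations.
    """
    return [
        {
            "dataset": dataset,
            "tree_depth": depth,
            "alpha": alpha,
            "a_f": a_f,
            "iterations": NUM_ITERATIONS,
        }
        for dataset, depth, (alpha, a_f) in product(
            DATASETS, TREE_DEPTHS, A_F_BY_ALPHA.items()
        )
    ]


if __name__ == "__main__":
    for run in pilboost_runs()[:3]:
        print(run)
```

Keeping the α-to-parameter pairing in one mapping avoids mismatching the two lists of values when a configuration is edited by hand.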