The Tree Ensemble Layer: Differentiability meets Conditional Computation
Authors: Hussein Hazimeh, Natalia Ponomareva, Petros Mol, Zhenyu Tan, Rahul Mazumder
ICML 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on 23 classification datasets indicate over 10x speed-ups compared to the differentiable trees used in the literature and over 20x reduction in the number of parameters compared to gradient boosted trees, while maintaining competitive performance. Moreover, experiments on CIFAR, MNIST, and Fashion MNIST indicate that replacing dense layers in CNNs with our tree layer reduces the test loss by 7-53% and the number of parameters by 8x. |
| Researcher Affiliation | Collaboration | 1Massachusetts Institute of Technology, 2Google Research, 3Google Brain. Correspondence to: Hussein Hazimeh <hazimeh@mit.edu>. |
| Pseudocode | Yes | Algorithm 1 Conditional Forward Pass |
| Open Source Code | Yes | We provide an open-source TensorFlow implementation of TEL along with a Keras interface (https://github.com/google-research/google-research/tree/master/tf_trees). [A hedged usage sketch appears after the table.] |
| Open Datasets | Yes | 23 of these are from the Penn Machine Learning Benchmarks (PMLB) (Olson et al., 2017), and the 3 remaining are CIFAR-10 (Krizhevsky et al., 2009), MNIST (LeCun et al., 1998), and Fashion MNIST (Xiao et al., 2017). |
| Dataset Splits | Yes | For all the experiments, we tune the hyperparameters using Hyperopt (Bergstra et al., 2013) with the Tree-structured Parzen Estimator (TPE). We optimize for either AUC or accuracy with stratified 5-fold cross-validation. |
| Hardware Specification | No | No specific hardware details such as GPU or CPU models, or specific cloud instance types, are mentioned in the paper. The paper states that TEL is implemented in TensorFlow 2.0 and includes experiments with CNNs, implying computational resources were used, but without specific hardware specifications. |
| Software Dependencies | Yes | TEL is implemented in TensorFlow 2.0 using custom C++ kernels for forward and backward propagation, along with a Keras Python-accessible interface. |
| Experiment Setup | Yes | For all the experiments, we tune the hyperparameters using Hyperopt (Bergstra et al., 2013) with the Tree-structured Parzen Estimator (TPE). We optimize for either AUC or accuracy with stratified 5-fold cross-validation. NNs (including TEL) were trained using Keras with the TensorFlow backend, using Adam (Kingma & Ba, 2014) and cross-entropy loss. As discussed in Section 2, TEL is always preceded by a batch normalization layer. For TEL, we tune the learning rate, batch size, and number of epochs (ranges are in the appendix). [The tuning protocol is sketched after the table.] |
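
The Open Source Code and Software Dependencies rows state that TEL ships as a Keras-accessible layer in TensorFlow 2.0. Below is a minimal sketch of how such a layer might be dropped into a Keras model. The `tf_trees` import path and the `TEL` constructor argument names are assumptions about the repository's interface, not verified from the source; only the batch-normalization-before-TEL detail comes from the paper.

```python
import tensorflow as tf
# Hypothetical import: the repository exposes a Keras layer, but the exact
# module path, class name, and constructor arguments here are assumptions.
from tf_trees import TEL

model = tf.keras.Sequential([
    # The paper notes that TEL is always preceded by a batch normalization layer.
    tf.keras.layers.BatchNormalization(),
    # Assumed argument names for the ensemble size, tree depth, and output width.
    TEL(output_logits_dim=10, trees_num=10, depth=4),
    tf.keras.layers.Activation("softmax"),
])
model.compile(optimizer=tf.keras.optimizers.Adam(),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```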
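The Dataset Splits and Experiment Setup rows describe tuning with Hyperopt's TPE, scored by stratified 5-fold cross-validation, with models trained via Adam and cross-entropy. The sketch below illustrates that protocol on synthetic data; the search-space ranges, `max_evals`, and the placeholder dense model (standing in for TEL) are illustrative assumptions, since the paper's actual ranges are given in its appendix.

```python
import numpy as np
import tensorflow as tf
from hyperopt import fmin, tpe, hp
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold

# Toy stand-in for one of the PMLB classification tasks.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

def build_model(learning_rate):
    # Placeholder dense model; the paper would use TEL preceded by batch norm here.
    model = tf.keras.Sequential([
        tf.keras.layers.BatchNormalization(),
        tf.keras.layers.Dense(2, activation="softmax"),
    ])
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=learning_rate),
                  loss="sparse_categorical_crossentropy", metrics=["accuracy"])
    return model

def objective(params):
    # Stratified 5-fold cross-validated accuracy for one hyperparameter setting.
    skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
    scores = []
    for tr, va in skf.split(X, y):
        model = build_model(params["learning_rate"])
        model.fit(X[tr], y[tr], batch_size=int(params["batch_size"]),
                  epochs=int(params["epochs"]), verbose=0)
        scores.append(model.evaluate(X[va], y[va], verbose=0)[1])
    return -float(np.mean(scores))  # Hyperopt minimizes, so negate mean accuracy.

# Illustrative ranges only; the paper's tuning ranges are listed in its appendix.
space = {
    "learning_rate": hp.loguniform("learning_rate", np.log(1e-4), np.log(1e-1)),
    "batch_size": hp.choice("batch_size", [64, 128, 256]),
    "epochs": hp.choice("epochs", [10, 20]),
}
best = fmin(fn=objective, space=space, algo=tpe.suggest, max_evals=20)
print(best)
```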