Globally Induced Forest: A Prepruning Compression Scheme
Authors: Jean-Michel Begon, Arnaud Joly, Pierre Geurts
ICML 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In Section 4, we show that our proposed algorithm, with its default setting, performs well on many datasets, sometimes even surpassing much larger models. We then conduct an extensive analysis of its hyper-parameters (Section 4.2). |
| Researcher Affiliation | Academia | 1Department of Electrical Engineering and Computer Science, University of Liège, Liège, Belgium. Correspondence to: Jean-Michel Begon <jm.begon@ulg.ac.be>, Pierre Geurts <p.geurts@ulg.ac.be>. |
| Pseudocode | Yes | Algorithm 1 Globally Induced Forest |
| Open Source Code | Yes | The code is readily available at https://github.com/jm-begon/globally-induced-forest |
| Open Datasets | Yes | All the results presented in this section are averaged over ten folds with different learning sample/testing sample splits. See the Supplementary material for detailed information on the datasets. Datasets used (results table columns: ET100%, ET10%, GIF10%, ET1%, GIF1%): Friedman1, Abalone, CT slice, Hwang F5, Cadata, Ringnorm, Twonorm, Hastie, Musk2, Madelon, MNIST 8 vs 9, binary Vowel, binary MNIST, binary Letter, Waveform, Vowel, MNIST, Letter. |
| Dataset Splits | Yes | All the results presented in this section are averaged over ten folds with different learning sample/testing sample splits. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU, GPU models, memory) used for running the experiments. |
| Software Dependencies | Yes | The extremely randomized trees were computed with version 0.18 of Scikit-Learn (Pedregosa et al., 2011) |
| Experiment Setup | Yes | For GIF, we started with T = 1000 stumps, a learning rate of λ = 10^-1.5 and CW = 1. The underlying tree building algorithm is ET with no restriction regarding the depth and p features are examined for each split, in both classification and regression. We will refer to this parameter setting as the default one. |
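The extremely randomized trees (ET) baseline described above can be reproduced with scikit-learn. A minimal sketch, assuming a regression task on a synthetic dataset: the ET configuration mirrors the paper's default (unlimited depth, all p features examined at each split), while the `gif_defaults` dictionary simply restates the GIF hyper-parameters quoted in the table (T = 1000 stumps, λ = 10^-1.5, CW = 1) — it is not a verified API of the released GIF code.

```python
# Sketch of the ET baseline with scikit-learn (the paper used v0.18;
# the calls below are stable across versions).
from sklearn.datasets import make_friedman1
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.model_selection import train_test_split

# Friedman1 is one of the regression datasets listed in the paper.
X, y = make_friedman1(n_samples=300, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Paper's default tree builder: ET, no depth restriction,
# all p features examined at each split (max_features=None).
et = ExtraTreesRegressor(n_estimators=100, max_features=None, random_state=0)
et.fit(X_tr, y_tr)
score = et.score(X_te, y_te)  # R^2 on the held-out split

# GIF defaults quoted in the table, restated for reference
# (hypothetical key names, not the released code's API):
gif_defaults = {"T": 1000, "learning_rate": 10 ** -1.5, "CW": 1}
```

For the GIF results themselves, the released implementation at https://github.com/jm-begon/globally-induced-forest should be used with the default setting quoted above.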