Random Composite Forests

Authors: Giulia DeSalvo, Mehryar Mohri

AAAI 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Using our theoretical analysis, we devise a new algorithm, RANDOMCOMPOSITEFOREST (RCF), that is based on forming an ensemble of random composite trees. We report the results of experiments demonstrating that RCF yields significant performance improvements over both Random Forests and a variant of RCF in several tasks.
Researcher Affiliation | Collaboration | Giulia DeSalvo, Courant Institute of Mathematical Sciences, 251 Mercer St., New York, NY 10012, desalvo@cims.nyu.edu; Mehryar Mohri, Courant Institute and Google Research, 251 Mercer St., New York, NY 10012, mohri@cs.nyu.edu
Pseudocode | Yes | Algorithm 1 RANDOMCOMPOSITEFOREST(B, r, γ)
Open Source Code | No | We implemented RCF, RFs, and RF-SVM by using scikit-learn (Pedregosa et al.). The paper does not provide its own source code for the described methodology.
Open Datasets | Yes | We tested RCF on eleven datasets from UCI's data repository: german, vehicle, vowel, dna, pendigits, iris, abalone, and a2a.
Dataset Splits | Yes | For each dataset, we randomly divided the data into training, validation, and test sets in order to run the RCF algorithm.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running its experiments.
Software Dependencies | No | We implemented RCF, RFs, and RF-SVM by using scikit-learn (Pedregosa et al.). While 'scikit-learn' is mentioned, no version number is provided.
Experiment Setup | Yes | For the SVM algorithm at the leaves of each composite tree, we allowed the set of polynomial degrees $G = \{1, \ldots, 9\}$. The number of different sequences of degree values was $p = 10$. For each polynomial degree $\delta \in G$, the regularization parameter $C_\delta \in \{10^i : i = -3, \ldots, 2\}$ of SVMs was selected via cross-validation and at each leaf $k$, it was simply scaled by $\frac{m_k}{m} C_\delta$. The $r$ parameter which determined the size of the subset of features was in the following range: $r \in \{1, \sqrt{\lvert F \rvert}, \ldots, \lvert F \rvert\}$. The $\gamma$ parameter that rescales the bound was $\gamma \in \{10^i : i = -3, \ldots, 0\}$ and the maximum depth of each tree varied within $M \in \{2, \ldots, 8\}$. The number of trees averaged in the random forest function $f$ was $B \in \{100, \ldots, 900\}$.
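
The search space in the Experiment Setup row can be written out as a short Python sketch. It is an illustration under stated assumptions, not the authors' code. The exponent ranges are read as starting at -3. The step of 100 for the number of trees B is hypothetical, since only the endpoints are given. The intermediate feature-subset size is assumed to be the square root of |F|. All names (`DEGREES`, `feature_subset_sizes`, `leaf_C`, `forest_configs`) are made up for this example.

```python
import math
from itertools import product

# Grids read from the Experiment Setup row (negative exponents are an assumption).
DEGREES = list(range(1, 10))                      # G = {1, ..., 9}
C_GRID = [10.0 ** i for i in range(-3, 3)]        # C_delta in {10^-3, ..., 10^2}
GAMMA_GRID = [10.0 ** i for i in range(-3, 1)]    # gamma in {10^-3, ..., 10^0}
MAX_DEPTHS = list(range(2, 9))                    # M in {2, ..., 8}

# Assumption: only the endpoints 100 and 900 are stated; the step of 100 is hypothetical.
N_TREES = list(range(100, 1000, 100))


def feature_subset_sizes(n_features):
    """Candidate values of r: 1, an assumed sqrt(|F|), and |F| itself."""
    root = max(1, round(math.sqrt(n_features)))
    return sorted({1, root, n_features})


def leaf_C(C_delta, m_k, m):
    """Scale the SVM regularization parameter at leaf k by m_k / m."""
    return (m_k / m) * C_delta


def forest_configs(n_features):
    """Enumerate forest-level settings (B, r, gamma, M) as dictionaries."""
    grid = product(N_TREES, feature_subset_sizes(n_features), GAMMA_GRID, MAX_DEPTHS)
    return [
        {"B": B, "r": r, "gamma": gamma, "max_depth": depth}
        for B, r, gamma, depth in grid
    ]
```

For a hypothetical dataset with 16 features, `forest_configs(16)` enumerates 9 x 3 x 4 x 7 = 756 forest-level settings. The per-leaf SVM degree and `C_delta` would then be chosen separately, for example by cross-validation as the row describes.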