Random Composite Forests
Authors: Giulia DeSalvo, Mehryar Mohri
AAAI 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Using our theoretical analysis, we devise a new algorithm, RANDOMCOMPOSITEFOREST (RCF), that is based on forming an ensemble of random composite trees. We report the results of experiments demonstrating that RCF yields significant performance improvements over both Random Forests and a variant of RCF in several tasks. |
| Researcher Affiliation | Collaboration | Giulia DeSalvo, Courant Institute of Mathematical Sciences, 251 Mercer St., New York, NY 10012, desalvo@cims.nyu.edu; Mehryar Mohri, Courant Institute and Google Research, 251 Mercer St., New York, NY 10012, mohri@cs.nyu.edu |
| Pseudocode | Yes | Algorithm 1 RANDOMCOMPOSITEFOREST(B, r, γ) |
| Open Source Code | No | We implemented RCF, RFs, and RF-SVM by using scikit-learn (Pedregosa et al.). The paper does not provide its own source code for the described methodology. |
| Open Datasets | Yes | We tested RCF on eleven datasets from UCI's data repository: german, vehicle, vowel, dna, pendigits, iris, abalone, and a2a. |
| Dataset Splits | Yes | For each dataset, we randomly divided the data into training, validation, and test sets in order to run the RCF algorithm. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running its experiments. |
| Software Dependencies | No | We implemented RCF, RFs, and RF-SVM by using scikit-learn (Pedregosa et al.). While 'scikit-learn' is mentioned, no version number is provided. |
| Experiment Setup | Yes | For the SVM algorithm at the leaves of each composite tree, we allowed the set of polynomial degrees $G = \{1, \ldots, 9\}$. The number of different sequences of degree values was $p = 10$. For each polynomial degree $\delta \in G$, the regularization parameter $C_\delta \in \{10^i : i = -3, \ldots, 2\}$ of SVMs was selected via cross-validation and at each leaf $k$, it was simply scaled by $\frac{m_k}{m} C_\delta$. The $r$ parameter which determined the size of the subset of features was in the following range: $r \in \{1, \lvert F \rvert, \ldots, \lvert F \rvert\}$. The $\gamma$ parameter that rescales the bound was $\gamma \in \{10^i : i = -3, \ldots, 0\}$ and the maximum depth of each tree varied within $M \in \{2, \ldots, 8\}$. The number of trees averaged in the random forest function $f$ was $B \in \{100, \ldots, 900\}$. |
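
The search space quoted in the Experiment Setup row can be written out as a small grid. The following is a minimal sketch, not the authors' code, and it rests on several assumptions: the lower exponents of the $C_\delta$ and $\gamma$ grids are read as $-3$, $B$ is assumed to step by 100, and $m_k$ and $m$ are assumed to be the number of training points at leaf $k$ and in total. The names `leaf_C` and `forest_configs` are hypothetical. The $r$ grid is left out because its intermediate values are not recoverable from the quoted text.

```python
from itertools import product

# Assumed reading of the grids quoted in the Experiment Setup row.
# The negative lower exponents and the step of 100 for B are assumptions.
DEGREES = list(range(1, 10))                    # G = {1, ..., 9}
C_GRID = [10.0 ** i for i in range(-3, 3)]      # C_delta in {10^i : i = -3, ..., 2}
GAMMA_GRID = [10.0 ** i for i in range(-3, 1)]  # gamma in {10^i : i = -3, ..., 0}
MAX_DEPTHS = list(range(2, 9))                  # M in {2, ..., 8}
NUM_TREES = list(range(100, 1000, 100))         # B in {100, ..., 900}, assumed step 100


def leaf_C(C_delta, m_k, m):
    """Scale the cross-validated C_delta by the fraction m_k / m.

    m_k and m are assumed to be the number of training points
    reaching leaf k and the total number of training points.
    """
    if m <= 0 or not 0 <= m_k <= m:
        raise ValueError("expected 0 <= m_k <= m and m > 0")
    return (m_k / m) * C_delta


def forest_configs():
    """Yield one candidate setting per (gamma, max depth, B) combination."""
    for gamma, depth, num_trees in product(GAMMA_GRID, MAX_DEPTHS, NUM_TREES):
        yield {"gamma": gamma, "max_depth": depth, "B": num_trees}
```

Under these assumptions the forest-level grid has 4 × 7 × 9 = 252 candidate settings, each of which would be scored on the validation split.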