Learning concise representations for regression by evolving networks of trees

Authors: William La Cava, Tilak Raj Singh, James Taggart, Srinivas Suri, Jason H. Moore

ICLR 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We benchmark these variants on 100 open-source regression problems in comparison to state-of-the-art machine learning approaches.
Researcher Affiliation | Academia | Institute for Biomedical Informatics, University of Pennsylvania. {lacava, moore}@upenn.edu, {tilakraj, jtagg, surisr}@seas.upenn.edu
Pseudocode | No | The paper describes the method using numbered steps (e.g., '1. Fit a linear model...', '2. While the stop criterion is not met:'), but these are not explicitly labeled as 'Pseudocode' or 'Algorithm' blocks.
Open Source Code | Yes | http://github.com/lacava/feat
Open Datasets | Yes | For the regression datasets, we use 100 real-world and simulated datasets available from OpenML (Vanschoren et al., 2014). [...] We use the standardized versions of the datasets available in the Penn Machine Learning Benchmark repository (Olson et al., 2017). (A loading sketch follows the table.)
Dataset Splits | Yes | For each method, we use grid search to tune the hyperparameters with 10-fold cross validation (CV). (A grid-search sketch follows the table.)
Hardware Specification | No | Runs are conducted in a heterogeneous computing environment, with one core assigned to each CV training per dataset. No specific hardware models (CPU/GPU) are provided.
Software Dependencies | No | The paper mentions using 'implementations from scikit-learn' and 'Adam (Kingma & Ba, 2014)' but does not provide specific version numbers for these or any other software dependencies.
Experiment Setup | Yes | Table 2 of the paper lists the comparison methods and their hyperparameters (tuned values denoted with brackets):
    FEAT: population size 500; termination criterion 200 generations, 60 minutes, or 50 iterations of stalled median validation loss; max depth 10; max dimensionality 50; objectives {(MSE, C), (MSE, C, Corr), (MSE, C, CN)}; feedback (f) {0.25, 0.5, 0.75}; crossover/mutation ratio {0.25, 0.5, 0.75}; batch size 1000; learning rate (initial) 0.1; SGD iterations / individual / generation 10
    MLP: optimizer {LBFGS, Adam (Kingma & Ba, 2014)}; hidden layers {1, 3, 6}; neurons {(100,), (100, 50, 10), (100, 50, 20, 10, 10, 8)}; learning rate (initial) {1e-4, 1e-3, 1e-2}; activation {logistic, tanh, relu}; regularization L2, α = {1e-5, 1e-4, 1e-3}; max iterations 10000; early stopping True
    XGBoost: number of estimators {10, 100, 200, 500, 1000}; max depth {3, 4, 5, 6, 7}; min split loss (γ) {1e-3, 1e-2, 0.1, 1, 10, 1e2, 1e3}; learning rate {0, 0.01, ..., 1.0}
    Random Forest: number of estimators {10, 100, 1000}; min weight fraction leaf {0.0, 0.25, 0.5}
    Kernel Ridge: kernel radial basis function; regularization (α) {1e-3, 1e-2, 0.1, 1}; kernel width (γ) {1e-2, 0.1, 1, 10, 100}
    Elastic Net: l1-l2 ratio {0, 0.01, ..., 1.0}; selection {cyclic, random}
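
Loading sketch for the Open Datasets row. The standardized datasets live in the Penn Machine Learning Benchmark repository, which ships a Python package (pmlb). This is a minimal sketch, assuming the pmlb package's fetch_data interface; the dataset picked below is illustrative, not one singled out by the paper.

    # Sketch: fetch one of the standardized PMLB regression datasets.
    # Assumes the `pmlb` package (pip install pmlb); the dataset chosen here
    # is illustrative, not one named in the paper.
    from pmlb import fetch_data, regression_dataset_names

    print(len(regression_dataset_names), "regression datasets available")

    # return_X_y=True yields feature matrix X and target y ready for scikit-learn.
    X, y = fetch_data(regression_dataset_names[0], return_X_y=True)
    print(X.shape, y.shape)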
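
Grid-search sketch for the Dataset Splits and Experiment Setup rows. The tuning protocol (grid search with 10-fold CV) maps directly onto scikit-learn's GridSearchCV; the grid below is the Kernel Ridge grid from Table 2. The scoring metric and the X, y inputs are assumptions for illustration, since the paper does not pin them down in the quoted text.

    # Sketch: grid search with 10-fold CV over the Kernel Ridge grid from Table 2.
    # Assumes scikit-learn; X, y would come from a PMLB dataset as in the loading sketch.
    from sklearn.kernel_ridge import KernelRidge
    from sklearn.model_selection import GridSearchCV

    param_grid = {
        "kernel": ["rbf"],                 # radial basis function
        "alpha": [1e-3, 1e-2, 0.1, 1],     # regularization (alpha)
        "gamma": [1e-2, 0.1, 1, 10, 100],  # kernel width (gamma)
    }

    # scoring is an assumption; the paper reports MSE-based comparisons.
    search = GridSearchCV(KernelRidge(), param_grid, cv=10,
                          scoring="neg_mean_squared_error")
    search.fit(X, y)
    print(search.best_params_, search.best_score_)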