Neural Tangent Kernels for Axis-Aligned Tree Ensembles
Authors: Ryuichi Kanoh, Mahito Sugiyama
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our numerical experiments show a variety of suitable features depending on the type of constraints. Our NTK analysis highlights both the theoretical and practical impacts of the axis-aligned constraint in tree ensemble learning. |
| Researcher Affiliation | Academia | ¹National Institute of Informatics, ²The Graduate University for Advanced Studies, SOKENDAI. |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The implementation we used in our numerical experiments is available online: https://github.com/ryuichi0704/aa-tntk |
| Open Datasets | Yes | We use EasyMKL (Aiolli & Donini, 2015), a convex approach that identifies kernel combinations maximizing the margin between classes. Figure 6 displays the weights obtained by EasyMKL on the entire tic-tac-toe dataset preprocessed by Fernández-Delgado et al. (2014). [...] In this experiment, we used the diabetes dataset (https://archive.ics.uci.edu/dataset/34/diabetes), a commonly used real-world dataset for regression tasks... (see the EasyMKL sketch below the table) |
| Dataset Splits | Yes | Figure 7 displays the results of four-fold cross-validation, where 25 percent of the total amount of data were used for training and the remainder for evaluation. (see the split sketch below the table) |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers (e.g., library or solver names with version numbers). |
| Experiment Setup | Yes | We set α = 2.0 and β = 0.5. [...] The models with M = 16 and 1024 are trained using full-batch gradient descent with a learning rate of 0.1. [...] Kernel parameters were set with α in {0.5, 1.0, 2.0, 4.0} and β in {0.1, 0.5, 1.0}. We used the regularization strength C = 1.0 in SVMs. For RF/GBDT, the number of weak learners is set to 1000. (see the parameter-grid sketch below the table) |
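The Open Datasets row quotes the paper's use of EasyMKL, which learns a margin-maximizing convex combination of candidate kernels. Below is a minimal sketch of that workflow, assuming the MKLpy package; the RBF candidate kernels, the `lam` value, and the `solution.weights` attribute follow MKLpy's documented interface and are not taken from the paper's code (the paper combines its TNTK kernels instead).

```python
# Hedged sketch: margin-maximizing kernel combination with EasyMKL,
# assuming the MKLpy package (pip install MKLpy). The candidate
# kernels here are illustrative RBF kernels, not the paper's TNTKs.
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel
from MKLpy.algorithms import EasyMKL

X = np.random.randn(100, 9)       # placeholder features (tic-tac-toe has 9 cells)
y = np.random.randint(0, 2, 100)  # placeholder binary labels

# One precomputed Gram matrix per candidate kernel to be weighted.
kernel_list = [rbf_kernel(X, X, gamma=g) for g in (0.1, 0.5, 1.0)]

mkl = EasyMKL(lam=0.1).fit(kernel_list, y)  # lam value is an assumption
print(mkl.solution.weights)  # learned per-kernel weights (cf. Figure 6)
```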
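The Dataset Splits row describes an inverted four-fold protocol: each 25% fold is used for training and the remaining 75% for evaluation, the opposite of standard cross-validation. A minimal sketch, assuming scikit-learn and a placeholder classifier and dataset (the paper's actual models and data differ):

```python
# Inverted four-fold split: train on each 25% fold, evaluate on the rest.
import numpy as np
from sklearn.model_selection import KFold
from sklearn.svm import SVC

X = np.random.randn(200, 9)       # placeholder features
y = np.random.randint(0, 2, 200)  # placeholder binary labels

kf = KFold(n_splits=4, shuffle=True, random_state=0)
scores = []
for eval_idx, train_idx in kf.split(X):
    # KFold yields (train, test) where the second slice is 1/4 of the
    # data; we swap the roles so that 25% trains and 75% evaluates.
    model = SVC(C=1.0)  # stand-in for the paper's kernel machines
    model.fit(X[train_idx], y[train_idx])
    scores.append(model.score(X[eval_idx], y[eval_idx]))
print(np.mean(scores))
```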
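The Experiment Setup row lists a grid over kernel parameters α ∈ {0.5, 1.0, 2.0, 4.0} and β ∈ {0.1, 0.5, 1.0} with SVMs at C = 1.0. The sketch below shows only that loop structure with a precomputed kernel; `placeholder_kernel` is a hypothetical stand-in, since the paper's axis-aligned TNTK closed form is not reproduced here.

```python
# Hedged sketch of the (alpha, beta) kernel grid with C = 1.0 SVMs.
import itertools
import numpy as np
from sklearn.svm import SVC

def placeholder_kernel(X1, X2, alpha, beta):
    """Stand-in Gram matrix; substitute the paper's TNTK here."""
    d = np.linalg.norm(X1[:, None] - X2[None, :], axis=-1)
    return alpha * np.exp(-beta * d ** 2)

X_train = np.random.randn(50, 8)        # placeholder data
y_train = np.random.randint(0, 2, 50)

for alpha, beta in itertools.product([0.5, 1.0, 2.0, 4.0], [0.1, 0.5, 1.0]):
    K = placeholder_kernel(X_train, X_train, alpha, beta)
    clf = SVC(C=1.0, kernel="precomputed").fit(K, y_train)
    # Evaluation on held-out data would use
    # placeholder_kernel(X_test, X_train, alpha, beta) as the test Gram matrix.
```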