Analyzing Tree Architectures in Ensembles via Neural Tangent Kernel

Authors: Ryuichi Kanoh, Mahito Sugiyama

ICLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We experimentally examined the effects of the degeneracy phenomenon discussed in Section 4.2. Setup. We used 90 classification tasks in the UCI database (Dua & Graff, 2017), each of which has fewer than 5000 data points as in (Arora et al., 2020). We performed kernel regression using the limiting NTK defined in Equation 5 and Equation 11, equivalent to the infinite ensemble of the perfect binary trees and decision lists. ... Performance. Figure 8 shows the averaged performance in classification accuracy on 90 datasets."
Researcher Affiliation | Academia | Ryuichi Kanoh¹,², Mahito Sugiyama¹,² (¹National Institute of Informatics; ²The Graduate University for Advanced Studies, SOKENDAI)
Pseudocode | No | No pseudocode or algorithm blocks were found.
Open Source Code | Yes | "REPRODUCIBILITY STATEMENT: Proofs are provided in the Appendix. For numerical experiments and figures, reproducible source codes are shared in the supplementary material."
Open Datasets | Yes | "We used 90 classification tasks in the UCI database (Dua & Graff, 2017)."
Dataset Splits | Yes | "We report four-fold cross-validation performance with random data splitting as in Arora et al. (2020) and Fernández-Delgado et al. (2014)."
Hardware Specification | Yes | "We ran all experiments on 2.20 GHz Intel Xeon E5-2698 CPU and 252 GB of memory with Ubuntu Linux (version: 4.15.0-117-generic)."
Software Dependencies | No | "We used scikit-learn to perform kernel regression. We used scikit-learn for the implementation." (No specific version numbers are provided for these software components.)
Experiment Setup | Yes | "We used D in {2, 4, 8, 16, 32, 64, 128} and α in {1.0, 2.0, 4.0, 8.0, 16.0, 32.0}. The scaled error function is used as a decision function. To consider the ridge-less situation, regularization strength is fixed to 1.0 × 10⁻⁸. As for hyperparameters, we used max_depth in {2, 4, 6}, subsample in {0.6, 0.8, 1.0}, learning_rate in {0.1, 0.01, 0.001}, and n_estimators (the number of trees) in {100, 300, 500}." (Sketches of this setup, under stated assumptions, follow the table.)
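
The kernel-regression pipeline quoted in the Research Type, Dataset Splits, and Experiment Setup rows can be pictured as follows. This is a minimal sketch, not the authors' supplementary code: tree_ntk is a hypothetical placeholder (a plain linear kernel) standing in for the limiting NTK of Equations 5 and 11, and scikit-learn's KernelRidge with a precomputed kernel plays the role of the near ridge-less regression with regularization fixed to 1.0 × 10⁻⁸.

```python
# Sketch only, not the authors' released code: `tree_ntk` is a stand-in for
# the paper's limiting NTK (Equations 5 and 11), which is not reproduced here.
import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.model_selection import KFold
from sklearn.preprocessing import LabelBinarizer

def tree_ntk(X_a, X_b, depth=8, alpha=2.0):
    """Placeholder Gram matrix; a real run would evaluate the closed-form
    limiting NTK for depth-`depth` trees with scaling `alpha` instead."""
    return X_a @ X_b.T  # linear kernel used purely as an illustrative stand-in

def cv_accuracy(X, y, depth, alpha, n_splits=4, seed=0):
    """Four-fold cross-validation accuracy of (near) ridge-less kernel
    regression, with the regularization strength fixed to 1e-8."""
    labels = LabelBinarizer().fit_transform(y)   # one-hot regression targets
    if labels.shape[1] == 1:                     # binary case -> two columns
        labels = np.hstack([1 - labels, labels])
    accs = []
    splitter = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    for tr, te in splitter.split(X):
        model = KernelRidge(alpha=1e-8, kernel="precomputed")
        model.fit(tree_ntk(X[tr], X[tr], depth, alpha), labels[tr])
        pred = model.predict(tree_ntk(X[te], X[tr], depth, alpha)).argmax(axis=1)
        accs.append(np.mean(pred == labels[te].argmax(axis=1)))
    return float(np.mean(accs))
```

Under this reading, sweeping D over {2, 4, 8, 16, 32, 64, 128} and α over {1.0, 2.0, 4.0, 8.0, 16.0, 32.0} amounts to calling cv_accuracy once per (D, α) pair on each UCI task.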
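
The max_depth / subsample / learning_rate / n_estimators grid in the last row describes a finite tree-ensemble baseline. The excerpt does not name the boosting implementation used, so the sketch below assumes scikit-learn's GradientBoostingClassifier wrapped in GridSearchCV purely for illustration.

```python
# Sketch only: the choice of GradientBoostingClassifier is an assumption,
# since the excerpt does not identify the boosting library actually used.
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV

param_grid = {
    "max_depth": [2, 4, 6],
    "subsample": [0.6, 0.8, 1.0],
    "learning_rate": [0.1, 0.01, 0.001],
    "n_estimators": [100, 300, 500],
}

search = GridSearchCV(
    GradientBoostingClassifier(random_state=0),
    param_grid=param_grid,
    cv=4,                  # mirrors the four-fold cross-validation above
    scoring="accuracy",
)
# search.fit(X, y)         # X, y: features/labels of one UCI task
# search.best_params_      # selected hyperparameters for that task
```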