Efficient Non-greedy Optimization of Decision Trees

Authors: Mohammad Norouzi, Maxwell Collins, Matthew A. Johnson, David J. Fleet, Pushmeet Kohli

NeurIPS 2015

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Experiments on several classification benchmarks demonstrate that the resulting non-greedy decision trees outperform greedy decision tree baselines. Experiments are conducted on several benchmark datasets from LibSVM [6] for multi-class classification, namely SensIT, Connect4, Protein, and MNIST. |
| Researcher Affiliation | Collaboration | 1,4 Department of Computer Science, University of Toronto; 2 Department of Computer Science, University of Wisconsin-Madison; 3,5 Microsoft Research |
| Pseudocode | Yes | Algorithm 1: Stochastic gradient descent (SGD) algorithm for non-greedy decision tree learning (an illustrative sketch follows the table). |
| Open Source Code | No | The paper does not provide concrete access to source code. |
| Open Datasets | Yes | Experiments are conducted on several benchmark datasets from LibSVM [6] for multi-class classification, namely SensIT, Connect4, Protein, and MNIST. |
| Dataset Splits | Yes | We use the provided train/validation/test sets when available. If such splits are not provided, we use a random 80%/20% split of the training data for train/validation, and a random 64%/16%/20% split for train/validation/test sets (see the split sketch following the table). |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper mentions LibSVM as a source for datasets but does not provide specific version numbers for any software dependencies used in the implementation (e.g., Python, PyTorch, TensorFlow, etc.). |
| Experiment Setup | Yes | The hyper-parameters for each method are tuned for each depth independently. We train multiple trees for different values of ν ∈ {0.1, 1, 4, 10, 43, 100}, and we pick the value of ν that produces the tree with minimum validation error. We also tune the choice of the SGD learning rate, η, in this step (see the tuning sketch following the table). To build non-greedy trees, we initially build an axis-aligned tree with split functions that threshold a single feature, optimized using conventional procedures that maximize information gain. The axis-aligned split is used to initialize a greedy variant of the tree training procedure called CO2 [18]. This provides initial values for W and Θ for the non-greedy procedure. |
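
The paper's Algorithm 1 performs SGD on an upper bound of the tree's empirical loss via a loss-augmented inference step that is not reproduced here. As a rough illustration only, the sketch below swaps in a simpler soft (sigmoid) routing surrogate with a squared loss to show the general shape of jointly updating the split weights W and leaf parameters Θ with SGD; the function names, the surrogate loss, and the routing scheme are all our assumptions, not the authors' method.

```python
import numpy as np

def soft_tree_forward(x, W, Theta, depth):
    """Leaf reach probabilities mu under sigmoid routing, and yhat = mu @ Theta.

    W:     (2**depth - 1, p) internal-node weights in heap order (root first).
    Theta: (2**depth, n_classes) leaf parameter vectors.
    """
    s = 1.0 / (1.0 + np.exp(-W @ x))      # gate activation at each internal node
    mu = np.empty(2 ** depth)
    for leaf in range(2 ** depth):
        prob, node = 1.0, 1               # heap indexing: root = 1
        for bit in range(depth - 1, -1, -1):
            go_right = (leaf >> bit) & 1
            prob *= s[node - 1] if go_right else 1.0 - s[node - 1]
            node = 2 * node + go_right
        mu[leaf] = prob
    return s, mu, mu @ Theta

def sgd_step(x, y, W, Theta, depth, lr=0.1):
    """One SGD update on the squared loss ||yhat - y||^2 / 2 (illustrative surrogate)."""
    s, mu, yhat = soft_tree_forward(x, W, Theta, depth)
    r = yhat - y                          # residual = dL/dyhat
    dTheta = np.outer(mu, r)              # dL/dTheta_l = mu_l * r
    dW = np.zeros_like(W)
    for leaf in range(2 ** depth):
        node = 1
        for bit in range(depth - 1, -1, -1):
            go_right = (leaf >> bit) & 1
            gate = s[node - 1] if go_right else 1.0 - s[node - 1]
            sign = 1.0 if go_right else -1.0
            # Chain rule: dL/ds_node through this leaf, then ds/dw = s(1-s) x.
            dL_ds = (r @ Theta[leaf]) * sign * mu[leaf] / max(gate, 1e-12)
            dW[node - 1] += dL_ds * s[node - 1] * (1.0 - s[node - 1]) * x
            node = 2 * node + go_right
    W -= lr * dW
    Theta -= lr * dTheta
    return W, Theta
```

Heap indexing keeps the parent/child arithmetic trivial, and the 1e-12 guard avoids dividing by a fully saturated gate.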
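
The quoted split protocol translates directly into index arithmetic. A minimal sketch, assuming a hypothetical helper; the split fractions come from the quoted text:

```python
import numpy as np

def make_splits(n_samples, test_provided, seed=0):
    """Random index splits per the quoted protocol (hypothetical helper).

    test_provided=True : 80%/20% train/validation over the training data.
    test_provided=False: 64%/16%/20% train/validation/test.
    """
    idx = np.random.default_rng(seed).permutation(n_samples)
    if test_provided:
        cut = int(0.8 * n_samples)
        return idx[:cut], idx[cut:]
    a, b = int(0.64 * n_samples), int(0.80 * n_samples)
    return idx[:a], idx[a:b], idx[b:]
```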
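
The quoted tuning procedure is an ordinary grid search over ν and η with validation-error model selection, run independently per depth. A sketch under stated assumptions: `train_tree` and `val_error` stand in for the paper's training and evaluation routines, and the η grid is invented (the paper says η is tuned but gives no values):

```python
from itertools import product

NU_GRID = (0.1, 1, 4, 10, 43, 100)   # values quoted from the paper
ETA_GRID = (0.01, 0.1, 1.0)          # assumed; the paper does not list an eta grid

def tune_for_depth(train_tree, val_error, depth):
    """Pick the (nu, eta) pair minimizing validation error at a fixed tree depth."""
    best_err, best_tree, best_cfg = float("inf"), None, None
    for nu, eta in product(NU_GRID, ETA_GRID):
        tree = train_tree(depth=depth, nu=nu, eta=eta)
        err = val_error(tree)
        if err < best_err:
            best_err, best_tree, best_cfg = err, tree, (nu, eta)
    return best_tree, best_cfg, best_err
```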