Learning Optimal Decision Trees with MaxSAT and its Integration in AdaBoost
Authors: Hao Hu, Mohamed Siala, Emmanuel Hebrard, Marie-José Huguet
IJCAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We perform three experiments on datasets from CP4IM. The first experiment aims at highlighting the overfitting behaviour of the SAT approach. In the second experiment we compare the performance with the state-of-the-art heuristic (CART) and exact (DL8.5) methods. Finally, we evaluate the MaxSAT implementation of boosting and bagging. The datasets are binarized with one-hot encoding. |
| Researcher Affiliation | Academia | LAAS-CNRS, Université de Toulouse, CNRS, INSA, Toulouse, France |
| Pseudocode | No | The paper describes methods and formulas using mathematical equations and textual explanations but does not include any explicit pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our source code and datasets are available online at https://gepgitlab.laas.fr/hhu/maxsat-decision-trees. |
| Open Datasets | Yes | We perform three experiments on datasets from CP4IM (https://dtai.cs.kuleuven.be/CP4IM/datasets/). The datasets are binarized with one-hot encoding. |
| Dataset Splits | Yes | We use the hold-out method to split each dataset into training and testing sets. Following [Narodytska et al., 2018], we choose (only in the first experiment) three small splitting ratios r = {0.05, 0.1, 0.2} to generate training sets; the remaining examples are used for testing. This process is repeated 10 times using different randomisation seeds. In the second experiment, we use stratified sampling to preserve the class distribution with 5-fold cross-validation, repeated with 10 different randomisation seeds for each dataset. In the third experiment, we use the hold-out method with ratio r = 0.8 to split each dataset into training and testing sets, again repeated with 10 different randomisation seeds. (A sketch of this splitting protocol appears after the table.) |
| Hardware Specification | Yes | We ran all experiments on a Xeon E5-2695 v3 @ 2.30GHz CPU running Ubuntu 16.04.6 LTS. |
| Software Dependencies | Yes | We use the RC2 MaxSAT solver [Ignatiev et al., 2019]. We use the MaxSAT solver Loandra [Berg et al., 2019; Demirović and Stuckey, 2019]... DL8.5 [Aglin et al., 2020] as a state-of-the-art exact approach via its Python package (version 0.0.9). (A minimal RC2 usage sketch appears after the table.) |
| Experiment Setup | Yes | Their parameters are kept to their default values, except for the maximum depth, which is a fixed parameter for all the models. We choose three different maximum depths d ∈ {2, 3, 4} for CART and DL8.5. The timeout for training is set to 15 minutes and the memory limit to 16GB for all the experiments in this section. The core-guided phase in Loandra is limited to 10 minutes. The number of learners (for boosting and bagging) is set to 21. (A configuration sketch appears after the table.) |
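
The Dataset Splits row describes three distinct protocols. Below is a minimal sketch of how those splits could be reproduced with scikit-learn; the placeholder `X`/`y` arrays stand in for a one-hot-encoded CP4IM dataset, and the loading code is an assumption, not part of the paper's artifacts.

```python
# Sketch of the three splitting protocols quoted above (scikit-learn).
# X, y are placeholders for a one-hot-encoded CP4IM dataset.
import numpy as np
from sklearn.model_selection import train_test_split, StratifiedKFold

X = np.random.randint(0, 2, size=(200, 20))  # placeholder binary features
y = np.random.randint(0, 2, size=200)        # placeholder binary labels

# Experiment 1: small hold-out training ratios, 10 random seeds.
for r in (0.05, 0.1, 0.2):
    for seed in range(10):
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, train_size=r, random_state=seed)
        # ... train and evaluate here ...

# Experiment 2: stratified 5-fold cross-validation, 10 seeds.
# Stratification is stated in the quote only for this protocol.
for seed in range(10):
    skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    for train_idx, test_idx in skf.split(X, y):
        X_tr, y_tr = X[train_idx], y[train_idx]
        X_te, y_te = X[test_idx], y[test_idx]
        # ... train and evaluate here ...

# Experiment 3: hold-out with ratio r = 0.8 for training, 10 seeds.
for seed in range(10):
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, train_size=0.8, random_state=seed)
```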
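The RC2 solver cited in the Software Dependencies row ships with the PySAT library. The snippet below is a generic weighted-MaxSAT example of driving RC2; it is not the paper's decision-tree encoding, which lives in the authors' repository.

```python
# Minimal RC2 (PySAT) usage sketch: a generic MaxSAT instance,
# not the paper's decision-tree encoding.
from pysat.formula import WCNF
from pysat.examples.rc2 import RC2

wcnf = WCNF()
wcnf.append([1, 2])           # hard clause: x1 OR x2 must hold
wcnf.append([-1], weight=1)   # soft clause: prefer x1 false
wcnf.append([-2], weight=1)   # soft clause: prefer x2 false

with RC2(wcnf) as solver:
    model = solver.compute()              # an optimal model, or None if UNSAT
    print(model, "cost:", solver.cost)    # cost = total weight of falsified softs
```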
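Finally, a sketch of the baseline and ensemble configuration described in the last two rows. It assumes the cited `dl85` Python package exposes a `DL85Classifier` with `max_depth` and `time_limit` parameters (hedged for version 0.0.9), and it uses CART as the boosting base learner purely for illustration; the paper's actual MaxSAT tree learner is in the linked repository.

```python
# Sketch of the reported baseline/ensemble configuration (assumptions noted above).
from sklearn.tree import DecisionTreeClassifier            # CART baseline
from sklearn.ensemble import AdaBoostClassifier
from dl85 import DL85Classifier                            # exact DL8.5 baseline (assumed API)

# Fixed maximum depths d in {2, 3, 4}; all other parameters left at defaults.
for d in (2, 3, 4):
    cart = DecisionTreeClassifier(max_depth=d)
    dl85 = DL85Classifier(max_depth=d, time_limit=900)     # 15-minute training timeout

# 21 learners for boosting, mirroring the reported ensemble size; CART stands in
# for the paper's MaxSAT trees here. (`estimator` is the scikit-learn >= 1.2
# keyword; older versions use `base_estimator`.)
booster = AdaBoostClassifier(
    estimator=DecisionTreeClassifier(max_depth=2), n_estimators=21)
```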