Learning Optimal Decision Trees with SAT

Authors: Nina Narodytska, Alexey Ignatiev, Filipe Pereira, Joao Marques-Silva

IJCAI 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental We perform an experimental evalution on a set of benchmarks from [Bessiere et al., 2009; Olson et al., 2017]. We ran our experiments on Intel(R) Xeon(R) CPU 3.50GHz. The Glucose3 SAT solver [Audemard and Simon, 2009] was used, with a total timeout of 1000 sec in all experiments. The memory limit is 2GB. We also employed the ITI algorithm to produce decision trees from labeled examples [Utgoff et al., 1997a; Utgoff et al., 1997b].
Researcher Affiliation Collaboration 1 VMware Research, CA, USA 2 LASIGE, Faculdade de Ciˆencias, Universidade de Lisboa, Portugal 3 ISDCT SB RAS, Irkutsk, Russia
Pseudocode No The paper does not contain structured pseudocode or algorithm blocks. Table 2 is a 'Description of variables', not an algorithm.
Open Source Code No The paper does not provide any explicit statements or links indicating that the source code for the described methodology is open or publicly available.
Open Datasets Yes We perform an experimental evalution on a set of benchmarks from [Bessiere et al., 2009; Olson et al., 2017]. We start with toy benchmarks from [Bessiere et al., 2009; Olson et al., 2017] that admit a small DT (less than 15 nodes). The paper also lists specific datasets in Tables 4, 5, and 6, such as 'Mouse', 'Weather', 'Car', 'Shuttle', 'Colic', 'Australian', 'Cancer', 'House Votes', etc.
Dataset Splits No The paper mentions a 'test set' but does not explicitly specify a distinct 'validation' set or its split percentage/methodology. It focuses on training and testing without detailing a separate validation phase.
Hardware Specification Yes We ran our experiments on Intel(R) Xeon(R) CPU 3.50GHz.
Software Dependencies No The paper mentions 'Glucose3 SAT solver [Audemard and Simon, 2009]', 'ITI algorithm [Utgoff et al., 1997a; Utgoff et al., 1997b]', and 'scikit-learn [Pedregosa and et al., 2011]' but does not provide specific version numbers for any of these software components.
Experiment Setup Yes The Glucose3 SAT solver [Audemard and Simon, 2009] was used, with a total timeout of 1000 sec in all experiments. The memory limit is 2GB. To find the optimal tree we perform a linear search on the total number of nodes starting from the upper bound provided by the ITI tree. We only consider trees with an odd number of nodes. We randomly sampled a fraction r of samples from datasets.