IPBoost – Non-Convex Boosting via Integer Programming
Authors: Marc Pfetsch, Sebastian Pokutta
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We report results that are comparable to or better than the current state-of-the-art. We present computational results demonstrating that IP-based boosting can avoid the bad examples of (Long & Servedio, 2008): by far better solutions can be obtained via LP/IP-based boosting for these instances. We also show that IP-based boosting can be competitive for real-world instances from the LIBSVM data set. |
| Researcher Affiliation | Academia | 1Department of Mathematics, TU Darmstadt, Germany 2Department of Mathematics, TU Berlin and Zuse Institute Berlin, Berlin, Germany. |
| Pseudocode | Yes | Algorithm 1 IPBoost |
| Open Source Code | Yes | The code is available through the web pages of the authors. |
| Open Datasets | Yes | We use classification instances from the LIBSVM data sets available at https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/. |
| Dataset Splits | No | Note that we randomly split off 20% of the points for the test set, and recall that we report the averages of 10 runs. For the other 24, we randomly split off 20% of the points as a test set. The paper describes train and test splits, but does not explicitly detail a separate validation split. |
| Hardware Specification | Yes | All tests were run on a Linux cluster with Intel Xeon quad core CPUs with 3.50GHz, 10 MB cache, and 32 GB of main memory. |
| Software Dependencies | Yes | We used a prerelease version of SCIP 7.0.0 with SoPlex 5.0.0 as LP-solver (Gamrath et al., 2020), the Python framework scikit-learn (Pedregosa et al., 2011), and the AdaBoost implementation in version 0.21.3 of scikit-learn. |
| Experiment Setup | Yes | We use the decision tree implementation of scikit-learn with a maximal depth of 1, i.e., a decision stump, as base learners for all boosters. We performed 10 runs for each instance with varying random seeds, with a time limit of one hour for each run of IPBoost, subsampling 30,000 points if their number N is larger than this threshold. Another crucial choice in our approach is the margin bound ρ. We ran our code with different values; the aggregated results are presented in Table 2. |
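The experimental protocol described in the table (decision stumps as base learners, a random 20% test split, 10 runs with varying seeds, and subsampling above 30,000 points) can be sketched as below. This is a minimal illustration, not the authors' code: it uses scikit-learn's AdaBoost baseline (mentioned in the report) on a synthetic dataset standing in for the LIBSVM instances, and the `run_baseline` helper name is an assumption.

```python
# Sketch of the baseline experimental setup, assuming the scikit-learn
# AdaBoost baseline described in the report; the synthetic dataset
# stands in for the LIBSVM instances used in the paper.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split


def run_baseline(X, y, n_runs=10, subsample=30_000):
    """Average test accuracy over n_runs with varying random seeds."""
    accs = []
    for seed in range(n_runs):
        rng = np.random.RandomState(seed)
        # Subsample 30,000 points if the instance is larger (as in the setup).
        if X.shape[0] > subsample:
            idx = rng.choice(X.shape[0], subsample, replace=False)
            Xs, ys = X[idx], y[idx]
        else:
            Xs, ys = X, y
        # Randomly split off 20% of the points as a test set.
        X_tr, X_te, y_tr, y_te = train_test_split(
            Xs, ys, test_size=0.2, random_state=seed)
        # scikit-learn's AdaBoost uses a depth-1 decision tree
        # (a decision stump) as its default base learner.
        clf = AdaBoostClassifier(random_state=seed)
        clf.fit(X_tr, y_tr)
        accs.append(clf.score(X_te, y_te))
    return float(np.mean(accs))


# Illustrative run on synthetic data (assumption, not a LIBSVM instance).
X, y = make_classification(n_samples=500, random_state=0)
acc = run_baseline(X, y)
```

The IP/LP-based boosting itself would replace the `AdaBoostClassifier` fit with solving an integer program (via SCIP/SoPlex) under the one-hour time limit; that solver loop is not reproduced here.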