Learning Interpretable Decision Rule Sets: A Submodular Optimization Approach
Authors: Fan Yang, Kai He, Linxiao Yang, Hongxia Du, Jingbang Yang, Bo Yang, Liang Sun
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on real datasets demonstrate the effectiveness of our method. [...] Our experimental study is conducted on 20 public datasets. |
| Researcher Affiliation | Industry | Fan Yang, Kai He, Linxiao Yang, Hongxia Du, Jingbang Yang, Bo Yang, Liang Sun DAMO Academy, Alibaba Group, Hangzhou, China {fanyang.yf,kai.he,linxiao.ylx,hongxia.dhx,jingbang.yjb,muhai.yb,liang.sun} @alibaba-inc.com |
| Pseudocode | Yes | Algorithm 1 Rule set learning. [...] Algorithm 2 DS-OPT(R, u, w). [...] Algorithm 3 Local combinatorial search. |
| Open Source Code | No | The paper references third-party implementations and code (e.g., for BRS and RIPPER) but does not provide open-source code for its own described methodology. |
| Open Datasets | Yes | Our experimental study is conducted on 20 public datasets. Fifteen of them are from the UCI repository [16], and the other five are variants of the Pro Publica recidivism dataset (COMPAS) [29] and the Fair Isaac credit risk dataset (FICO) [21]. [...] [16] Dheeru Dua and Casey Graff. UCI machine learning repository, 2017. URL http://archive.ics.uci.edu/ml. [...] [29] Jeff Larson, Surya Mattu, Lauren Kirchner, and Julia Angwin. How we analyzed the compas recidivism algorithm. Pro Publica, 2016. [...] [21] FICO, Google, Imperial College London, MIT, University of Oxford, UC Irvine, and UC Berkeley. Explainable machine learning challenge, 2018. URL https://community.fico.com/s/explainable-machine-learning-challenge. |
| Dataset Splits | Yes | We estimate numerical results based on 10-fold stratified cross-validation (CV). In each CV fold, we use grid search to optimize the hyperparameters of each algorithm on the training split. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments. It only mentions general concepts like "Bit vectors are used in our implementation to process a large number of samples efficiently." and "Scalability test." |
| Software Dependencies | No | The paper mentions software such as the "scikit-learn package [40]" but does not specify version numbers or a complete list of software dependencies needed for reproduction. |
| Experiment Setup | Yes | For the method proposed in this paper, we fix β0 = β1 = 1 and optimize the remaining hyperparameters β2 ∈ {0.5, 0.1, 0.01}, λ ∈ {0.1, 1, 4, 8, 16, 64} and K ∈ {8, 16, 32}. The hyperparameters of CG include the strength of complexity penalty and the beam width, for which we sweep in {0.001, 0.002, 0.005} and {10, 20}, respectively. For RIPPER, the proportion of training set used for pruning is varied in {0.2, 0.25, . . . , 0.6}. For BRS, the maximum length of a rule is chosen from {3, 5}. For CART and RF, we tune the minimum number of samples at leaf nodes from 1 to 100 and fix the number of trees in RF to be 100. |
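The evaluation protocol quoted above (10-fold stratified cross-validation with per-fold grid search over the hyperparameter spaces) can be sketched as follows. This is a minimal stdlib-only illustration, not the authors' implementation: the fold splitter, toy labels, and grid key names (`beta2`, `lam`, `K`) are assumptions made for the example.

```python
# Hedged sketch: 10-fold stratified CV with a hyperparameter grid enumerated
# per fold, mirroring the protocol described in the table. Dataset and grid
# names are illustrative placeholders, not the paper's code.
import itertools
import random
from collections import defaultdict

def stratified_k_fold(labels, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs preserving per-class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    folds = [[] for _ in range(k)]
    for idxs in by_class.values():
        rng.shuffle(idxs)
        # Round-robin assignment keeps each class evenly spread across folds.
        for j, i in enumerate(idxs):
            folds[j % k].append(i)
    for f in range(k):
        test = folds[f]
        train = [i for g in range(k) if g != f for i in folds[g]]
        yield train, test

# Grid mirroring the search spaces quoted above for the proposed method.
grid = {
    "beta2": [0.5, 0.1, 0.01],
    "lam": [0.1, 1, 4, 8, 16, 64],
    "K": [8, 16, 32],
}

def grid_points(grid):
    """Enumerate every hyperparameter configuration in the grid."""
    keys = sorted(grid)
    for values in itertools.product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))

labels = [0] * 50 + [1] * 50  # toy binary labels for illustration
fold_sizes = [len(test) for _, test in stratified_k_fold(labels, k=10)]
n_configs = sum(1 for _ in grid_points(grid))  # 3 * 6 * 3 = 54 configurations
```

In the paper's protocol, each of the 54 configurations would be scored on the training split of every fold and the best one refit before evaluating on that fold's held-out data; the sketch shows only the split and grid enumeration.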