Learning Interpretable Decision Rule Sets: A Submodular Optimization Approach

Authors: Fan Yang, Kai He, Linxiao Yang, Hongxia Du, Jingbang Yang, Bo Yang, Liang Sun

NeurIPS 2021

Reproducibility Variable Result LLM Response
Research Type Experimental Experiments on real datasets demonstrate the effectiveness of our method. [...] Our experimental study is conducted on 20 public datasets.
Researcher Affiliation Industry Fan Yang, Kai He, Linxiao Yang, Hongxia Du, Jingbang Yang, Bo Yang, Liang Sun — DAMO Academy, Alibaba Group, Hangzhou, China. {fanyang.yf,kai.he,linxiao.ylx,hongxia.dhx,jingbang.yjb,muhai.yb,liang.sun}@alibaba-inc.com
Pseudocode Yes Algorithm 1 Rule set learning. [...] Algorithm 2 DS-OPT(R, u, w). [...] Algorithm 3 Local combinatorial search.
Open Source Code No The paper references third-party implementations and code (e.g., for BRS and RIPPER) but does not provide open-source code for its own described methodology.
Open Datasets Yes Our experimental study is conducted on 20 public datasets. Fifteen of them are from the UCI repository [16], and the other five are variants of the Pro Publica recidivism dataset (COMPAS) [29] and the Fair Isaac credit risk dataset (FICO) [21]. [...] [16] Dheeru Dua and Casey Graff. UCI machine learning repository, 2017. URL http://archive.ics.uci.edu/ml. [...] [29] Jeff Larson, Surya Mattu, Lauren Kirchner, and Julia Angwin. How we analyzed the COMPAS recidivism algorithm. Pro Publica, 2016. [...] [21] FICO, Google, Imperial College London, MIT, University of Oxford, UC Irvine, and UC Berkeley. Explainable machine learning challenge, 2018. URL https://community.fico.com/s/explainable-machine-learning-challenge.
Dataset Splits Yes We estimate numerical results based on 10-fold stratified cross-validation (CV). In each CV fold, we use grid search to optimize the hyperparameters of each algorithm on the training split.
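The stratified splitting protocol quoted above can be sketched in plain Python. This is a minimal illustration of what a stratified k-fold assignment does (each fold preserves the class proportions of the full dataset), not the authors' implementation; the function name and round-robin scheme are assumptions for illustration.

```python
from collections import defaultdict

def stratified_kfold_indices(labels, k=10):
    """Assign each sample index to one of k folds, preserving class balance.

    Samples of each class are dealt round-robin across the folds, so every
    fold receives roughly len(class) / k samples of that class.
    """
    per_class = defaultdict(list)
    for idx, y in enumerate(labels):
        per_class[y].append(idx)

    folds = defaultdict(list)
    for y, idxs in per_class.items():
        for j, idx in enumerate(idxs):
            folds[j % k].append(idx)
    return [sorted(folds[i]) for i in range(k)]

# Toy dataset: 60 negatives, 40 positives.
labels = [0] * 60 + [1] * 40
folds = stratified_kfold_indices(labels, k=10)
# Each of the 10 folds holds ~6 negatives and ~4 positives.
```

In practice this is what `sklearn.model_selection.StratifiedKFold` provides; in each CV fold the held-out part is used for evaluation while the hyperparameter grid search runs on the training part only.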
Hardware Specification No The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments. It only mentions general concepts like "Bit vectors are used in our implementation to process a large number of samples efficiently." and "Scalability test."
Software Dependencies No The paper mentions software such as the "scikit-learn package [40]" but does not specify version numbers or a complete list of software dependencies needed to reproduce the experiments.
Experiment Setup Yes For the method proposed in this paper, we fix β0 = β1 = 1 and optimize the remaining hyperparameters β2 ∈ {0.5, 0.1, 0.01}, λ ∈ {0.1, 1, 4, 8, 16, 64} and K ∈ {8, 16, 32}. The hyperparameters of CG include the strength of complexity penalty and the beam width, for which we sweep in {0.001, 0.002, 0.005} and {10, 20}, respectively. For RIPPER, the proportion of training set used for pruning is varied in {0.2, 0.25, …, 0.6}. For BRS, the maximum length of a rule is chosen from {3, 5}. For CART and RF, we tune the minimum number of samples at leaf nodes from 1 to 100 and fix the number of trees in RF to be 100.
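The quoted sweep for the proposed method implies an exhaustive grid over three hyperparameters. A small sketch of how such a grid expands into candidate configurations (the dictionary layout and helper name are illustrative, not from the paper):

```python
from itertools import product

# Hyperparameter grid quoted in the paper for the proposed method
# (beta0 = beta1 = 1 are held fixed; the rest are swept).
grid = {
    "beta2": [0.5, 0.1, 0.01],
    "lambda": [0.1, 1, 4, 8, 16, 64],
    "K": [8, 16, 32],
}

def grid_configs(grid):
    """Enumerate every hyperparameter combination for grid search."""
    keys = list(grid)
    for values in product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))

configs = list(grid_configs(grid))
# 3 * 6 * 3 = 54 candidate configurations evaluated per CV fold.
```

With 10-fold CV this means 54 training runs of the method per fold for the proposed approach alone, which is the usual cost of the grid-search protocol described above.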