More Accurate Learning of k-DNF Reference Classes

Authors: Brendan Juba, Hengxuan Li (pp. 4385-4393)

AAAI 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We include an empirical demonstration that our large formula algorithm can be used in practice for a similar application, explaining the decisions of classifiers on a given point. We find that our algorithm can compete with the previous algorithms for this task (Ribeiro, Singh, and Guestrin 2016; 2018) while providing theoretical guarantees. Similarly, the small formula algorithm of Juba et al. (2018) was used for the conditional linear regression experiments of Hainline et al. (2019), and found to be effective in practice."
Researcher Affiliation | Collaboration | Brendan Juba, Washington University in St. Louis (bjuba@wustl.edu); Hengxuan Li, Facebook (lhx@fb.com)
Pseudocode | Yes | Algorithm 1: Partial Greedy Algorithm; Algorithm 2: Low Deg Partial(X)
Open Source Code | Yes | Code for the experiments can be found at https://github.com/lihengxuan-wustl/Refclass-KDNF.
Open Datasets | No | The paper mentions using the "Lending dataset" and states that experiments use "the same settings as Ribeiro et al.", implying the dataset's source. However, it does not provide a direct link, DOI, or a formal citation with author and year for explicit public access to the Lending dataset itself.
Dataset Splits | Yes | "Using the same settings as Ribeiro et al., we split the Lending dataset into three parts: a training set with 5635 examples, and a validation set and test set of 1134 examples each."
Hardware Specification | No | The paper states that "Our single-threaded Cython implementation takes about two days to compute a reference class on this data set" but does not specify any hardware details such as CPU, GPU, or memory.
Software Dependencies | No | The paper mentions a "Cython implementation" but does not specify version numbers for Cython or any other software dependencies.
Experiment Setup | No | The paper mentions training three models (logistic regression, gradient boosted trees, multilayer perceptron), using 3-DNF rules, and applying "the same transformation of the real-valued and categorical features to Boolean attributes as used by Ribeiro et al." However, it does not provide specific hyperparameter values or detailed training configurations (e.g., learning rates, batch sizes, epochs) for the models trained.
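The Dataset Splits row reports concrete partition sizes (5635 train, 1134 validation, 1134 test, i.e. 7903 examples total). A minimal sketch of reproducing such a split, assuming only a list of examples and a fixed seed for repeatability; `split_lending` is a hypothetical helper, not the authors' code:

```python
# Illustrative sketch (not the authors' implementation): partition a dataset
# into the split sizes reported in the paper: 5635 / 1134 / 1134.
import random

TRAIN_SIZE, VAL_SIZE, TEST_SIZE = 5635, 1134, 1134

def split_lending(examples, seed=0):
    """Shuffle and partition examples into train/validation/test splits
    matching the sizes reported in the paper."""
    assert len(examples) == TRAIN_SIZE + VAL_SIZE + TEST_SIZE
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = examples[:]
    rng.shuffle(shuffled)
    train = shuffled[:TRAIN_SIZE]
    val = shuffled[TRAIN_SIZE:TRAIN_SIZE + VAL_SIZE]
    test = shuffled[TRAIN_SIZE + VAL_SIZE:]
    return train, val, test

train, val, test = split_lending(list(range(7903)))
print(len(train), len(val), len(test))  # 5635 1134 1134
```

The paper follows Ribeiro et al.'s settings for this split; the seed and shuffling strategy above are assumptions for illustration only.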