More Accurate Learning of k-DNF Reference Classes

Authors: Brendan Juba, Hengxuan Li (pp. 4385-4393)

AAAI 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We include an empirical demonstration that our large formula algorithm can be used in practice for a similar application, explaining the decisions of classifiers on a given point. We find that our algorithm can compete with the previous algorithms for this task (Ribeiro, Singh, and Guestrin 2016; 2018) while providing theoretical guarantees. Similarly, the small formula algorithm of Juba et al. (2018) was used for the conditional linear regression experiments of Hainline et al. (2019), and found to be effective in practice."
Researcher Affiliation | Collaboration | Brendan Juba, Washington University in St. Louis (bjuba@wustl.edu); Hengxuan Li, Facebook (lhx@fb.com)
Pseudocode | Yes | Algorithm 1: Partial Greedy Algorithm; Algorithm 2: Low Deg Partial(X)
Open Source Code | Yes | Code for the experiments can be found at https://github.com/lihengxuan-wustl/Refclass-KDNF.
Open Datasets | No | The paper mentions using the "Lending dataset" and states that experiments use "the same settings as Ribeiro et al.", implying the dataset's source. However, it does not provide a direct link, DOI, or a formal citation with author and year for explicit public access to the Lending dataset itself.
Dataset Splits | Yes | "Using the same settings as Ribeiro et al., we split the Lending dataset into three parts: a training set with 5635 examples, and a validation set and test set of 1134 examples each."
Hardware Specification | No | The paper states that "Our single-threaded Cython implementation takes about two days to compute a reference class on this data set" but does not specify any hardware details such as CPU, GPU, or memory.
Software Dependencies | No | The paper mentions a "Cython implementation" but does not specify version numbers for Cython or any other software dependencies.
Experiment Setup | No | The paper mentions training three models (logistic regression, gradient boosted trees, multilayer perceptron), using 3-DNF rules, and applying "the same transformation of the real-valued and categorical features to Boolean attributes as used by Ribeiro et al." However, it does not provide specific hyperparameter values or detailed training configurations (e.g., learning rates, batch sizes, epochs) for the models trained.
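The Dataset Splits row reports concrete partition sizes (5635 train, 1134 validation, 1134 test, i.e. 7903 examples total). A minimal sketch of reproducing such a split, assuming only a list of examples and a fixed seed for repeatability; `split_lending` is a hypothetical helper, not the authors' code:

```python
# Illustrative sketch (not the authors' implementation): partition a dataset
# into the split sizes reported in the paper: 5635 / 1134 / 1134.
import random

TRAIN_SIZE, VAL_SIZE, TEST_SIZE = 5635, 1134, 1134

def split_lending(examples, seed=0):
    """Shuffle and partition examples into train/validation/test splits
    matching the sizes reported in the paper."""
    assert len(examples) == TRAIN_SIZE + VAL_SIZE + TEST_SIZE
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = examples[:]
    rng.shuffle(shuffled)
    train = shuffled[:TRAIN_SIZE]
    val = shuffled[TRAIN_SIZE:TRAIN_SIZE + VAL_SIZE]
    test = shuffled[TRAIN_SIZE + VAL_SIZE:]
    return train, val, test

train, val, test = split_lending(list(range(7903)))
print(len(train), len(val), len(test))  # 5635 1134 1134
```

The paper follows Ribeiro et al.'s settings for this split; the seed and shuffling strategy above are assumptions for illustration only.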