Explaining Random Forests Using Bipolar Argumentation and Markov Networks
Authors: Nico Potyka, Xiang Yin, Francesca Toni
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | As the computational complexity of the problems is high, we consider a probabilistic algorithm to approximate reasons and present first experimental results. We tested our algorithm on three datasets. The Iris and PIMA datasets are continuous datasets that have been considered for counterfactual explanations (White and d'Avila Garcez 2020). In addition, we consider the Mushroom dataset that contains discrete features. |
| Researcher Affiliation | Academia | Department of Computing, Imperial College London, London, UK {n.potyka, x.yin20, f.toni}@imperial.ac.uk |
| Pseudocode | Yes | Figure 2: Probabilistic approximation algorithm for estimating the percentage of non-ambiguous inputs, and the probabilities of sufficient and necessary queries. |
| Open Source Code | Yes | https://github.com/nicopotyka/Uncertainpy, folder examples/explanations/random Forests. |
| Open Datasets | Yes | We tested our algorithm on three datasets. The Iris and PIMA datasets are continuous datasets that have been considered for counterfactual explanations (White and d'Avila Garcez 2020). In addition, we consider the Mushroom dataset that contains discrete features. For reproducibility, the datasets are contained in the source folder. |
| Dataset Splits | No | The paper mentions using datasets for testing but does not explicitly specify training, validation, and test splits with percentages or sample counts, nor does it refer to predefined splits with citations. |
| Hardware Specification | Yes | We generated 10,000 samples for the first stage in less than one minute on a Windows laptop with i7-11800H CPU and 16 GB RAM. |
| Software Dependencies | No | The paper states it was implemented in Python but does not provide specific version numbers for Python or any other libraries or software dependencies. |
| Experiment Setup | Yes | We chose δ = 0.9. Our implementation works in two stages. The first stage is analogous to Figure 2 and the queries are the atomic sufficient and necessary queries of the form (U_y | U_i) and (U_i | U_y) for all combinations of feature arguments U_i and class arguments U_y. [...] For every pair (u_i, u_j), the probability can be estimated quickly. However, since there can be a large number of pairs, the overall runtime can be long and the almost sufficient reasons of size 2 are reported continuously while the sampling procedure is running. We generated 10,000 samples for the first stage in less than one minute. |
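The sampling procedure quoted above estimates query probabilities by drawing random inputs and checking the forest's predictions. A minimal sketch of that idea is below; it is not the authors' implementation (which lives in the Uncertainpy repository), and the feature index, threshold, and target class are illustrative assumptions. It estimates an atomic "sufficient" query of the form P(U_y | U_i) for a random forest trained on Iris, one of the datasets the paper uses:

```python
# Hedged sketch: Monte Carlo estimation of P(prediction = target | feature
# condition holds), in the spirit of the paper's two-stage sampling procedure.
# The function name, threshold, and sample count are illustrative choices.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X, y = load_iris(return_X_y=True)
forest = RandomForestClassifier(n_estimators=10, random_state=0).fit(X, y)

def estimate_sufficiency(forest, lo, hi, feature, threshold, target, n=10_000):
    """Estimate P(prediction == target | feature value > threshold) by
    sampling inputs uniformly from the per-feature ranges [lo, hi]."""
    samples = rng.uniform(lo, hi, size=(n, len(lo)))
    mask = samples[:, feature] > threshold   # inputs where the condition U_i holds
    if not mask.any():
        return 0.0
    preds = forest.predict(samples[mask])    # forest's class predictions (U_y)
    return float(np.mean(preds == target))

lo, hi = X.min(axis=0), X.max(axis=0)
p = estimate_sufficiency(forest, lo, hi, feature=3, threshold=1.7, target=2)
print(f"Estimated P(class 2 | petal width > 1.7) = {p:.2f}")
```

As the quoted setup notes, each individual probability is cheap to estimate this way; the cost comes from the number of query pairs, which is why the paper reports almost-sufficient reasons continuously while sampling runs.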