Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
# Learning Decision Trees and Forests with Algorithmic Recourse
Authors: Kentaro Kanamori, Takuya Takagi, Ken Kobayashi, Yuichi Ike
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results demonstrated that our method successfully provided reasonable actions to more instances than the baselines without significantly degrading accuracy and computational efficiency. |
| Researcher Affiliation | Collaboration | Fujitsu Limited, Japan; Tokyo Institute of Technology, Japan; Kyushu University, Japan. |
| Pseudocode | Yes | Algorithm 1 presents an algorithm for the problem (2). ... Algorithm 2 presents a pseudo-code of the actionable feature tweaking algorithm. ... Algorithm 3 presents a greedy approximation algorithm for the problem. |
| Open Source Code | Yes | All the code was implemented in Python 3.10 with Numba 0.56.4 and is available at https://github.com/kelicht/ract. |
| Open Datasets | Yes | We used four real datasets: FICO (N = 9871, D = 23) (FICO et al., 2018), COMPAS (N = 6167, D = 14) (Angwin et al., 2016), Credit (N = 30000, D = 16) (Yeh & Lien, 2009), and Bail (N = 8923, D = 16) (Schmidt & Witte, 1988). |
| Dataset Splits | Yes | We conducted 10-fold cross validation, and measured (i) the average accuracy and AUC on the test set, (ii) the average recourse ratio, which is defined as the ratio of the test instances that are guaranteed valid actions whose costs are less than ε = 0.3, and (iii) the average running time. |
| Hardware Specification | Yes | All the experiments were conducted on macOS Monterey with an Apple M1 Pro CPU and 32 GB memory. |
| Software Dependencies | Yes | All the code was implemented in Python 3.10 with Numba 0.56.4 and is available at https://github.com/kelicht/ract. |
| Experiment Setup | Yes | For the baselines and our RACT, we trained classification trees with a maximum depth of 64 and random forest classifiers with T = 200 classification trees. For each dataset, we determined the hyper-parameters δ and λ of our RACT based on the results of our trade-off analyses in Section 4.3. ... Table 5. Details of hyper-parameter tuning for our RACT. We varied the values of δ and λ in the predefined ranges (Range), and determined each value (Select) based on the results. |
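The evaluation protocol quoted above (10-fold cross-validation, reporting average accuracy and AUC on the test folds, with forests of T = 200 trees) can be sketched as follows. This is a minimal illustration only: it uses a synthetic dataset and scikit-learn's `RandomForestClassifier` as a stand-in for the authors' RACT implementation, so the data, model class, and random seeds here are assumptions, not the paper's code.

```python
# Sketch of the reported evaluation protocol: 10-fold CV, averaging
# accuracy and AUC over test folds. Synthetic data and a plain
# scikit-learn forest stand in for the paper's RACT models.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import StratifiedKFold

X, y = make_classification(n_samples=1000, n_features=16, random_state=0)

accs, aucs = [], []
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
for train_idx, test_idx in cv.split(X, y):
    # T = 200 trees and max depth 64, mirroring the setup in the table.
    clf = RandomForestClassifier(n_estimators=200, max_depth=64, random_state=0)
    clf.fit(X[train_idx], y[train_idx])
    accs.append(accuracy_score(y[test_idx], clf.predict(X[test_idx])))
    aucs.append(roc_auc_score(y[test_idx], clf.predict_proba(X[test_idx])[:, 1]))

mean_acc, mean_auc = float(np.mean(accs)), float(np.mean(aucs))
print(f"mean accuracy: {mean_acc:.3f}  mean AUC: {mean_auc:.3f}")
```

The recourse ratio reported in the paper (the fraction of test instances with a valid action of cost below ε = 0.3) would be computed per fold in the same loop; it is omitted here because it depends on the RACT-specific action search.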