Inter-node Hellinger Distance based Decision Tree
Authors: Pritom Saha Akash, Md. Eusha Kadir, Amin Ahsan Ali, Mohammad Shoyaib
IJCAI 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We perform an experiment over twenty balanced and twenty imbalanced datasets. The results show that decision trees based on iHD win against six other state-of-the-art methods on at least 14 balanced and 10 imbalanced datasets. We also observe that adding the weight to iHD improves the performance of decision trees on imbalanced datasets. Moreover, according to the result of the Friedman test, this improvement is statistically significant compared to other methods. |
| Researcher Affiliation | Academia | Institute of Information Technology, University of Dhaka, Bangladesh; Department of Computer Science & Engineering, Independent University, Bangladesh |
| Pseudocode | Yes | Algorithm 1 outlines the procedure of learning a binary DT using the proposed split criterion iHDw. (A hedged sketch of such a criterion appears after the table.) |
| Open Source Code | No | The paper does not provide concrete access to source code (e.g., specific repository link, explicit code release statement) for the methodology described. |
| Open Datasets | Yes | Table 2 shows 40 datasets chosen from various areas like biology, medicine and finance. [...] These datasets are collected from two well-known public sources called UCI Machine Learning Repository [Dua and Graff, 2017] and KEEL Imbalanced Data Sets [Alcalá-Fdez et al., 2011]. |
| Dataset Splits | Yes | We conduct 10-fold cross-validation on each dataset to get the unbiased result. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers (e.g., library or solver names with version numbers) needed to replicate the experiment. |
| Experiment Setup | Yes | For each dataset, we build eight unpruned DT classifiers based on iHD, iHDw, information gain (using both Entropy and Gini), Gain Ratio (GR), and the splitting criteria proposed in DCSM, HDDT and CCPDT respectively. [...] We conduct 10-fold cross-validation on each dataset to get the unbiased result. (See the illustrative cross-validation sketch after the table.) |
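
Since the paper releases no code, the split criterion named in the Pseudocode row can only be illustrated, not reproduced. Below is a minimal Python sketch, assuming iHD is a size-weighted Hellinger distance between the parent node's class distribution and each child's; the exact iHD/iHDw formulation (including the weight term for imbalanced data) is the paper's contribution, and the function names here are hypothetical.

```python
# Minimal sketch (NOT the authors' Algorithm 1): score a binary split by the
# Hellinger distance between the parent node's class distribution and each
# child's, weighted by child size. Illustrative only.
import numpy as np

def class_distribution(y, classes):
    """Empirical class probability vector over the labels in a node."""
    counts = np.array([np.sum(y == c) for c in classes], dtype=float)
    return counts / counts.sum()

def hellinger(p, q):
    """Hellinger distance between two discrete distributions."""
    return np.sqrt(0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2))

def inter_node_hellinger_score(y_parent, y_left, y_right):
    """Size-weighted Hellinger distance from parent to each child node."""
    classes = np.unique(y_parent)
    p_parent = class_distribution(y_parent, classes)
    score = 0.0
    for y_child in (y_left, y_right):
        if len(y_child) == 0:
            return 0.0  # degenerate split: one empty child
        weight = len(y_child) / len(y_parent)
        score += weight * hellinger(p_parent, class_distribution(y_child, classes))
    return score
```

A tree learner would evaluate such a score for every candidate split and greedily take the maximum, just as with information gain.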
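
The 10-fold cross-validation protocol from the Dataset Splits and Experiment Setup rows is standard; the sketch below shows how it might be run with scikit-learn, using the built-in entropy and gini criteria as stand-ins (iHD, iHDw, GR, DCSM, HDDT and CCPDT would each need a custom tree implementation) and a bundled UCI dataset as a placeholder for the paper's 40 datasets.

```python
# Sketch of the evaluation protocol only: 10-fold CV over unpruned decision
# trees. scikit-learn trees are unpruned by default; only entropy and gini
# are available as criteria. The dataset is a placeholder UCI dataset.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
for criterion in ("entropy", "gini"):
    tree = DecisionTreeClassifier(criterion=criterion, random_state=0)
    scores = cross_val_score(tree, X, y, cv=10)  # 10-fold cross-validation
    print(f"{criterion}: mean accuracy {scores.mean():.3f} +/- {scores.std():.3f}")
```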