Inter-node Hellinger Distance based Decision Tree

Authors: Pritom Saha Akash, Md. Eusha Kadir, Amin Ahsan Ali, Mohammad Shoyaib

IJCAI 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We perform an experiment over twenty balanced and twenty imbalanced datasets. The results show that decision trees based on iHD win against six other state-of-the-art methods on at least 14 balanced and 10 imbalanced datasets. We also observe that adding the weight to iHD improves the performance of decision trees on imbalanced datasets. Moreover, according to the result of the Friedman test, this improvement is statistically significant compared to other methods.
Researcher Affiliation | Academia | Institute of Information Technology, University of Dhaka, Bangladesh; Department of Computer Science & Engineering, Independent University, Bangladesh
Pseudocode | Yes | Algorithm 1 outlines the procedure of learning a binary DT using the proposed split criterion iHDw.
Open Source Code | No | The paper does not provide concrete access to source code (e.g., a specific repository link or an explicit code-release statement) for the methodology described.
Open Datasets | Yes | Table 2 shows 40 datasets chosen from various areas like biology, medicine and finance. [...] These datasets are collected from two well-known public sources called UCI Machine Learning Repository [Dua and Graff, 2017] and KEEL Imbalanced Data Sets [Alcalá-Fdez et al., 2011].
Dataset Splits | Yes | We conduct 10-fold cross-validation on each dataset to get the unbiased result.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts) used for running its experiments.
Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers (e.g., library or solver names with version numbers) needed to replicate the experiment.
Experiment Setup | Yes | For each dataset, we build eight unpruned DT classifiers based on iHD, iHDw, information gain (using both Entropy and Gini), Gain Ratio (GR), and the splitting criteria proposed in DCSM, HDDT and CCPDT, respectively. [...] We conduct 10-fold cross-validation on each dataset to get the unbiased result.
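
The rows above reference several computational steps that the sketches below illustrate without reproducing the paper's exact algorithms. The Pseudocode row points to Algorithm 1, which learns a binary DT with the proposed iHD/iHDw criterion; since that algorithm is not reproduced on this page, the following is a minimal sketch of a Hellinger-distance-based split score in the spirit of such criteria. The function names are hypothetical, and the paper's exact inter-node formulation and its weighted variant iHDw differ in detail.

```python
import numpy as np

def hellinger(p, q):
    """Hellinger distance between two discrete distributions p and q."""
    return np.sqrt(0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2))

def class_distribution(y, classes):
    """Empirical class distribution of the label vector y."""
    counts = np.array([np.sum(y == c) for c in classes], dtype=float)
    total = counts.sum()
    return counts / total if total > 0 else counts

def hellinger_split_score(y_left, y_right, classes):
    """Score a candidate binary split by the Hellinger distance between
    the class distributions of the two child nodes (larger means the
    children separate the classes better). Illustrative only: the
    paper's iHD/iHDw criteria are related but not identical."""
    return hellinger(class_distribution(y_left, classes),
                     class_distribution(y_right, classes))

# Toy example: a split that perfectly separates two classes scores 1.0.
y = np.array([0, 0, 0, 1, 1, 1])
print(hellinger_split_score(y[:3], y[3:], classes=[0, 1]))  # -> 1.0
```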
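The Experiment Setup and Dataset Splits rows describe eight unpruned decision trees compared via 10-fold cross-validation. scikit-learn does not implement the paper-specific criteria (iHD, iHDw, GR, DCSM, HDDT, CCPDT), so the sketch below reproduces only the evaluation protocol using the two built-in criteria; the stand-in dataset and the accuracy metric are assumptions for illustration, not details taken from the paper.

```python
from sklearn.datasets import load_breast_cancer  # stand-in UCI-style dataset
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)

# Unpruned trees (no depth/leaf limits), differing only in split criterion.
for criterion in ("gini", "entropy"):
    clf = DecisionTreeClassifier(criterion=criterion, random_state=42)
    acc = cross_val_score(clf, X, y, cv=cv, scoring="accuracy")
    print(f"{criterion}: mean accuracy = {acc.mean():.3f} (std {acc.std():.3f})")
```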
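Finally, the Research Type row mentions a Friedman test to establish statistical significance across methods. A minimal sketch with SciPy follows, assuming a hypothetical score matrix with one row per dataset and one column per method; the random scores here are placeholders, not results from the paper.

```python
import numpy as np
from scipy.stats import friedmanchisquare

# Hypothetical accuracies: rows = datasets, columns = methods
# (e.g., iHD, iHDw, Entropy, Gini, GR, DCSM, HDDT, CCPDT).
rng = np.random.default_rng(0)
scores = rng.uniform(0.7, 0.95, size=(20, 8))

# The Friedman test ranks the methods within each dataset and tests
# whether the mean ranks differ significantly across methods.
stat, p_value = friedmanchisquare(*scores.T)
print(f"Friedman chi-square = {stat:.3f}, p = {p_value:.4f}")
```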