reproducibilityindex.ai

Towards Balanced Defect Prediction with Better Information Propagation

Authors: Xianda Zheng, Yuan-Fang Li, Huan Gao, Yuncheng Hua, Guilin Qi
Pages: 759-767

AAAI 2021

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experimental results on real-world benchmark datasets show that DPCAG improves performance compared to the state-of-the-art models.
Researcher Affiliation	Collaboration	(1) School of Cyber Science and Engineering, Southeast University, Nanjing, China; (2) Faculty of Information Technology, Monash University, Melbourne, Australia; (3) Microsoft Asia-Pacific Research and Development Group, Suzhou, China; (4) School of Computer Science and Engineering, Southeast University, Nanjing, China; (5) Key Laboratory of Computer Network and Information Integration, Southeast University, Nanjing, China
Pseudocode	Yes	Algorithm 1: training DPCAG classifier
Open Source Code	No	The paper does not provide any explicit statement about releasing source code or a link to a code repository for the described methodology.
Open Datasets	Yes	We use three datasets from ELFF (Shippey et al. 2016), namely Dr Java, Genoviz and Jmol, to evaluate model performance.
Dataset Splits	Yes	For each dataset, the set of labeled nodes is divided into training, validation and test sets with the ratio of 90-5-5.
Hardware Specification	No	The paper does not provide any specific details about the hardware used for running the experiments (e.g., CPU, GPU models, or cloud computing specifications).
Software Dependencies	No	The paper mentions using RMSProp as the optimizer but does not specify its version or any other software dependencies (e.g., programming languages, libraries, or frameworks) with their version numbers.
Experiment Setup	Yes	For our model, the learning rate lr is set to 0.006 and the dropout rate p set to 0.5. The hyperparameter a is set to 25 for all three datasets. The hyperparameter b is set to 1300 in Dr Java and 2000 in both Genoviz and Jmol. The dimension of hidden layer is h = 20. We use RMSProp (Tieleman and Hinton 2012) as the optimizer. The confidence threshold for adding nodes to the training set is t = 0.8. In every iteration, the epoch for the Expectation step and the Maximization step is set to 200. We select the values of hyperparameters according to the result of validation set.
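
Since no source code is released, anyone attempting a reproduction has to reassemble the setup from the Dataset Splits and Experiment Setup rows above. The following is a minimal sketch of how those reported values might be collected in one place, together with a 90-5-5 split of the labeled nodes. The names `DPCAGConfig`, `config_for`, and `split_labeled_nodes` are hypothetical, and the uniform random shuffle with a fixed seed is an assumption: the quoted text gives the ratio but not how nodes were assigned to each set.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class DPCAGConfig:
    """Hyperparameter values quoted in the Experiment Setup row (field names are illustrative)."""
    learning_rate: float = 0.006
    dropout: float = 0.5
    a: int = 25                        # same for all three datasets
    b: int = 1300                      # dataset-dependent, see B_BY_DATASET
    hidden_dim: int = 20
    confidence_threshold: float = 0.8  # t, for adding nodes to the training set
    epochs_per_step: int = 200         # for both the E step and the M step
    optimizer: str = "RMSProp"


# Values of b as reported per dataset.
B_BY_DATASET = {"Dr Java": 1300, "Genoviz": 2000, "Jmol": 2000}


def config_for(dataset: str) -> DPCAGConfig:
    """Return the reported configuration for one of the three datasets."""
    return DPCAGConfig(b=B_BY_DATASET[dataset])


def split_labeled_nodes(node_ids, seed=0, ratios=(0.90, 0.05, 0.05)):
    """Split labeled node ids into train/validation/test lists.

    Assumption: a seeded uniform shuffle without stratification; the
    source does not state the actual assignment procedure.
    """
    ids = list(node_ids)
    random.Random(seed).shuffle(ids)
    n_train = int(round(ratios[0] * len(ids)))
    n_val = int(round(ratios[1] * len(ids)))
    train = ids[:n_train]
    val = ids[n_train:n_train + n_val]
    test = ids[n_train + n_val:]       # remainder goes to the test set
    return train, val, test
```

For example, `config_for("Jmol")` yields the shared values with `b = 2000`, and `split_labeled_nodes(range(1000))` returns lists of 900, 50, and 50 ids. Because the seed and shuffling procedure are not given in the source, results from such a split may not match the paper's exactly.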