Towards Balanced Defect Prediction with Better Information Propagation
Authors: Xianda Zheng, Yuan-Fang Li, Huan Gao, Yuncheng Hua, Guilin Qi
AAAI 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on real-world benchmark datasets show that DPCAG improves performance compared to state-of-the-art models. |
| Researcher Affiliation | Collaboration | (1) School of Cyber Science and Engineering, Southeast University, Nanjing, China; (2) Faculty of Information Technology, Monash University, Melbourne, Australia; (3) Microsoft Asia-Pacific Research and Development Group, Suzhou, China; (4) School of Computer Science and Engineering, Southeast University, Nanjing, China; (5) Key Laboratory of Computer Network and Information Integration, Southeast University, Nanjing, China |
| Pseudocode | Yes | Algorithm 1: training DPCAG classifier |
| Open Source Code | No | The paper does not provide any explicit statement about releasing source code or a link to a code repository for the described methodology. |
| Open Datasets | Yes | We use three datasets from ELFF (Shippey et al. 2016), namely Dr Java, Genoviz and Jmol, to evaluate model performance. |
| Dataset Splits | Yes | For each dataset, the set of labeled nodes is divided into training, validation and test sets in a 90-5-5 ratio. (See the split sketch below the table.) |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used for running the experiments (e.g., CPU, GPU models, or cloud computing specifications). |
| Software Dependencies | No | The paper mentions using RMSProp as the optimizer but does not specify its version or any other software dependencies (e.g., programming languages, libraries, or frameworks) with their version numbers. |
| Experiment Setup | Yes | For our model, the learning rate lr is set to 0.006 and the dropout rate p is set to 0.5. The hyperparameter a is set to 25 for all three datasets. The hyperparameter b is set to 1300 in Dr Java and 2000 in both Genoviz and Jmol. The dimension of the hidden layer is h = 20. We use RMSProp (Tieleman and Hinton 2012) as the optimizer. The confidence threshold for adding nodes to the training set is t = 0.8. In every iteration, the number of epochs for both the Expectation step and the Maximization step is set to 200. Hyperparameter values are selected according to results on the validation set. (See the training-loop sketch below the table.) |
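The 90-5-5 split quoted in the Dataset Splits row is mechanical enough to sketch. Below is a minimal Python sketch; the paper states only the ratio, so the shuffling, seeding, rounding behavior, and the `split_labeled_nodes` helper name are all assumptions:

```python
import numpy as np

def split_labeled_nodes(labeled_idx, seed=0):
    """Split labeled node indices into train/val/test at a 90-5-5 ratio.

    `labeled_idx` is a hypothetical array of labeled-node indices; the paper
    specifies only the ratio, so the shuffle and rounding here are assumptions.
    """
    rng = np.random.default_rng(seed)
    idx = rng.permutation(np.asarray(labeled_idx))
    n_train = int(0.90 * len(idx))
    n_val = int(0.05 * len(idx))
    train = idx[:n_train]
    val = idx[n_train:n_train + n_val]
    test = idx[n_train + n_val:]  # remainder, roughly 5%
    return train, val, test
```

With 1,000 labeled nodes this yields 900/50/50 train/validation/test indices; any remainder from rounding falls into the test set.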
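The Experiment Setup row collects enough hyperparameters to sketch the reported training procedure. The following is a minimal, hedged sketch, not the authors' implementation: `DPCAGClassifier` is a hypothetical stand-in for the real model, the 100-dimensional input is an assumption, and the exact division of work between DPCAG's Expectation and Maximization steps is not reproduced here. Only the quoted values (lr = 0.006, p = 0.5, h = 20, t = 0.8, 200 epochs per step, RMSProp) are taken from the paper:

```python
import torch
import torch.nn as nn

# Values quoted from the paper's setup; everything else below is an assumption.
LR = 0.006             # learning rate
DROPOUT = 0.5          # dropout rate p
HIDDEN_DIM = 20        # hidden-layer dimension h
TAU = 0.8              # confidence threshold t for growing the training set
EPOCHS_PER_STEP = 200  # epochs per Expectation / Maximization step

class DPCAGClassifier(nn.Module):
    """Hypothetical stand-in; the real DPCAG architecture is not shown here."""
    def __init__(self, in_dim, n_classes=2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, HIDDEN_DIM),
            nn.ReLU(),
            nn.Dropout(DROPOUT),
            nn.Linear(HIDDEN_DIM, n_classes),
        )

    def forward(self, x):
        return self.net(x)

def em_iteration(model, optimizer, x, y, train_mask):
    """One outer iteration: fit on the current training set, then add
    unlabeled nodes whose predicted confidence reaches t = 0.8."""
    loss_fn = nn.CrossEntropyLoss()
    model.train()
    for _ in range(EPOCHS_PER_STEP):
        optimizer.zero_grad()
        loss = loss_fn(model(x)[train_mask], y[train_mask])
        loss.backward()
        optimizer.step()
    model.eval()
    with torch.no_grad():
        probs = torch.softmax(model(x), dim=1)
        conf, pseudo = probs.max(dim=1)
        newly_confident = (conf >= TAU) & ~train_mask
        y = y.clone()
        y[newly_confident] = pseudo[newly_confident]  # assign pseudo-labels
        train_mask = train_mask | newly_confident
    return train_mask, y

model = DPCAGClassifier(in_dim=100)  # 100-dim node features: an assumption
optimizer = torch.optim.RMSprop(model.parameters(), lr=LR)
```

A driver would call `em_iteration` repeatedly until no new nodes clear the threshold, evaluating on the held-out validation set to select hyperparameters, as the quoted setup describes.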