Maximal Divergence Sequential Autoencoder for Binary Software Vulnerability Detection
Authors: Tue Le, Tuan Nguyen, Trung Le, Dinh Phung, Paul Montague, Olivier De Vel, Lizhen Qu
ICLR 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conducted extensive experiments to compare and contrast our proposed methods with the baselines, and the results indicate that our proposed methods outperform the baselines in all performance measures of interest. |
| Researcher Affiliation | Collaboration | Tue Le, Tuan Nguyen (AI Research Lab, Trusting Social, Australia) {tue.le, tuan.nguyen}@trustingsocial.com; Trung Le, Dinh Phung (Monash University, Australia) {trunglm, dinh.phung}@monash.edu; Paul Montague, Olivier De Vel (Defence Science and Technology Group, Department of Defence, Australia) {paul.montague, olivier.devel}@dst.defence.gov.au; Lizhen Qu (Data61, CSIRO, Australia) lizhen.qu@data61.csiro.au |
| Pseudocode | No | No pseudocode or explicit algorithm blocks found. |
| Open Source Code | Yes | The source code, as well as the dataset, is available in our GitHub repository: https://github.com/dascimal-org/MDSeqVAE |
| Open Datasets | Yes | One of our most significant contributions is to create a labeled dataset for use in binary code vulnerability detection. ... The source code, as well as the dataset, is available in our GitHub repository: https://github.com/dascimal-org/MDSeqVAE |
| Dataset Splits | Yes | We split the data into 80% for training, 10% for validation, and the remaining 10% for testing. (A sketch of this split appears below the table.) |
| Hardware Specification | Yes | We ran our experiments on a computer with an Intel Xeon Processor E5-1660 which had 8 cores at 3.0 GHz and 128 GB of RAM. |
| Software Dependencies | No | We implemented our proposed method in Python using Tensorflow (Abadi et al., 2016), an open-source software library for Machine Intelligence developed by the Google Brain Team. |
| Experiment Setup | Yes | For the RNN baselines and our models, the size of the hidden units was set to 256. For our model, the size of the latent space was set to 4,096, and the trade-off parameters α, β were set to 2×10⁻² and 10⁻⁴ respectively. We used the Adam optimizer (Kingma & Ba, 2014) with an initial learning rate equal to 0.0001. The minibatch size was set to 64 and the number of epochs was set to 100. (A hyperparameter sketch appears below the table.) |
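
For concreteness, here is a minimal sketch of the reported 80/10/10 split. The `split_dataset` helper name, the random seed, and the in-memory list representation are illustrative assumptions; the paper does not publish its splitting code in the text.

```python
import random

def split_dataset(samples, seed=0):
    """Shuffle and split samples into 80% train / 10% validation / 10% test,
    matching the proportions reported in the paper."""
    # The seed value and list-based representation are assumptions for illustration.
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_train = int(0.8 * n)
    n_val = int(0.1 * n)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # remaining ~10%
    return train, val, test
```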
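
Similarly, the reported hyperparameters can be gathered into a single configuration sketch. Only the numeric values come from the paper; the constant names and the use of the Keras `Adam` optimizer are assumptions, since the authors' original TensorFlow 1.x code may be organized differently.

```python
import tensorflow as tf

# Values as reported in the paper; all identifier names are illustrative.
HIDDEN_SIZE = 256     # RNN hidden-unit size (baselines and proposed models)
LATENT_SIZE = 4096    # dimensionality of the latent space
ALPHA = 2e-2          # trade-off parameter alpha
BETA = 1e-4           # trade-off parameter beta
BATCH_SIZE = 64       # minibatch size
EPOCHS = 100          # number of training epochs

# Adam with the reported initial learning rate of 0.0001.
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-4)
```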