PHOG: Probabilistic Model for Code
Authors: Pavol Bielik, Veselin Raychev, Martin Vechev
ICML 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We trained a PHOG model on a large JavaScript code corpus and show that it is more precise than existing models, while similarly fast. |
| Researcher Affiliation | Academia | Pavol Bielik (PAVOL.BIELIK@INF.ETHZ.CH), Veselin Raychev (VESELIN.RAYCHEV@INF.ETHZ.CH), Martin Vechev (MARTIN.VECHEV@INF.ETHZ.CH), Department of Computer Science, ETH Zürich, Switzerland |
| Pseudocode | No | The paper describes a language (TCOND) and procedures, but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide an explicit statement or a link to the open-source code for the PHOG model or its implementation. |
| Open Datasets | Yes | In our experiments, we use a corpus of 150,000 de-duplicated and non-obfuscated JavaScript files from GitHub (Raychev et al., 2016). Footnote: http://www.srl.inf.ethz.ch/js150.php |
| Dataset Splits | No | The paper states “Two thirds of the data is used for training and the remaining one third is used only for evaluation,” but does not explicitly specify a separate validation split (a split sketch follows the table). |
| Hardware Specification | Yes | Experiments were done on a 32-core 2.13 GHz Xeon E7-4830 server with 256GB RAM and running Ubuntu 14.04. |
| Software Dependencies | No | The paper mentions “Ubuntu 14.04” and “Acorn parser” but does not provide specific version numbers for software dependencies related to their PHOG implementation or libraries used. |
| Experiment Setup | Yes | The paper states “We instantiate Ω(p) to return the number of instructions” and “Overall, this search procedure explores 20,000 functions out of which the best one is selected,” and it mentions modified Kneser-Ney smoothing and Witten-Bell interpolation smoothing (a smoothing sketch follows the table). |
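The paper reports only the split proportions (two thirds training, one third evaluation) and not how files were assigned. As an illustration, here is a minimal sketch of one deterministic way to realize that split over the js150 file list; the variable names and the in-order assignment are assumptions, not details from the paper.

```python
def split_corpus(files):
    """Split a list of corpus file paths into train (2/3) and eval (1/3).

    The paper gives only the proportions; the ordering here is assumed.
    """
    cutoff = (2 * len(files)) // 3
    return files[:cutoff], files[cutoff:]

# Hypothetical usage with the 150,000-file js150 corpus:
#   train_files, eval_files = split_corpus(all_js150_paths)
#   -> 100,000 training files and 50,000 evaluation files
```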
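The paper names modified Kneser-Ney and Witten-Bell interpolation smoothing but does not include an implementation. The sketch below shows generic Witten-Bell interpolation (the simpler of the two) for P(word | history); the data structures and function name are illustrative and are not the authors' code.

```python
from collections import Counter

def witten_bell(counts, history, word, backoff_prob):
    """Witten-Bell interpolated probability P(word | history).

    counts: dict mapping a history (e.g., a context tuple) to a Counter of
            following words. This layout is an assumption for illustration.
    backoff_prob: probability of `word` under the shorter (backed-off) history.
    """
    follower_counts = counts.get(history, Counter())
    n = sum(follower_counts.values())  # total tokens observed after this history
    t = len(follower_counts)           # distinct word types after this history
    if n + t == 0:
        return backoff_prob            # unseen history: back off entirely
    lam = n / (n + t)                  # weight on the observed (MLE) estimate
    mle = follower_counts[word] / n
    return lam * mle + (1 - lam) * backoff_prob
```

The interpolation weight n / (n + t) gives more mass to the backoff distribution for histories followed by many distinct types, which is the standard Witten-Bell estimator.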