reproducibilityindex.ai

PHOG: Probabilistic Model for Code

Authors: Pavol Bielik, Veselin Raychev, Martin Vechev

ICML 2016

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We trained a PHOG model on a large JavaScript code corpus and show that it is more precise than existing models, while similarly fast.
Researcher Affiliation	Academia	Pavol Bielik PAVOL.BIELIK@INF.ETHZ.CH Veselin Raychev VESELIN.RAYCHEV@INF.ETHZ.CH Martin Vechev MARTIN.VECHEV@INF.ETHZ.CH Department of Computer Science, ETH Zürich, Switzerland
Pseudocode	No	The paper describes a language (TCOND) and procedures, but does not include structured pseudocode or algorithm blocks.
Open Source Code	No	The paper does not provide an explicit statement or a link to the open-source code for the PHOG model or its implementation.
Open Datasets	Yes	In our experiments, we use a corpus of 150,000 de-duplicated and non-obfuscated JavaScript files from GitHub (Raychev et al., 2016) (footnote 1: http://www.srl.inf.ethz.ch/js150.php).
Dataset Splits	No	The paper states “Two thirds of the data is used for training and the remaining one third is used only for evaluation,” but does not explicitly specify a separate validation dataset split.
Hardware Specification	Yes	Experiments were done on a 32-core 2.13 GHz Xeon E7-4830 server with 256GB RAM and running Ubuntu 14.04.
Software Dependencies	No	The paper mentions “Ubuntu 14.04” and “Acorn parser” but does not provide specific version numbers for software dependencies related to their PHOG implementation or libraries used.
Experiment Setup	Yes	The paper states “We instantiate Ω(p) to return the number of instructions.” and “Overall, this search procedure explores 20,000 functions out of which the best one is selected.” It also mentions modified Kneser-Ney smoothing and Witten-Bell interpolation smoothing.
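
The Dataset Splits row quotes a two-thirds / one-third division of the corpus but gives no procedure for producing it. The following is a minimal sketch of one way to get a repeatable split of that shape, not the authors' method. The function name `split_corpus`, the seed, and the placeholder file names are all hypothetical; only the 150,000 file count and the two-thirds training fraction come from the quoted text.

```python
import random


def split_corpus(paths, train_fraction=2 / 3, seed=0):
    """Deterministically shuffle file paths and split them into (train, held_out).

    Assumption: a seeded random shuffle is used here for repeatability;
    the paper does not say how its split was drawn.
    """
    shuffled = sorted(paths)  # sort first so the result does not depend on input order
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical placeholder names standing in for the 150,000 corpus files.
files = [f"file_{i}.js" for i in range(150_000)]
train, held_out = split_corpus(files)
print(len(train), len(held_out))  # prints: 100000 50000
```

Fixing the seed and sorting before shuffling means that anyone running the sketch on the same file list gets the same partition, which is the property a reproducibility check would look for. No validation set is carved out here, matching the observation in the table.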