DeepFix: Fixing Common C Language Errors by Deep Learning
Authors: Rahul Gupta, Soham Pal, Aditya Kanade, Shirish Shevade
AAAI 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | On a set of 6971 erroneous C programs written by students for 93 programming tasks, DeepFix could fix 1881 (27%) programs completely and 1338 (19%) programs partially. ... We apply DeepFix on C programs written by students for 93 different programming tasks in an introductory programming course. ... (Experiments, Experimental Setup) For training and evaluation, we used C programs written by students for 93 different programming tasks... |
| Researcher Affiliation | Academia | Rahul Gupta, Soham Pal, Aditya Kanade, Shirish Shevade Department of Computer Science and Automation, Indian Institute of Science, Bangalore, India {rahul.gupta, soham.pal, kanade, shirish}@csa.iisc.ernet.in |
| Pseudocode | No | The paper describes the Deep Fix approach and iterative repair strategy in detail, but it does not include any formal pseudocode blocks or algorithms. |
| Open Source Code | Yes | We provide the source code of the tool online at http://iisc-seal.net/deepfix. |
| Open Datasets | No | The paper mentions using 'C programs written by students for 93 different programming tasks' collected via 'a web-based tutoring system (Das et al. 2016)'. While it cites the system, it does not explicitly state that the dataset itself is publicly available or provide a direct link to it. |
| Dataset Splits | Yes | In order to give an accurate evaluation of our technique, we do a 5-fold cross validation by holding out roughly 1/5th of the programming tasks for each fold. |
| Hardware Specification | Yes | We train the neural networks on an Intel(R) Xeon(R) E5-2640 v3 16-core machine clocked at 2.60GHz with 125GB of RAM and equipped with an NVIDIA Tesla K40 GPU accelerator. |
| Software Dependencies | No | The paper states, 'We use the attention based sequence-to-sequence architecture implemented in Tensorflow (Abadi et al. 2015).' However, it does not provide a specific version number for TensorFlow or any other software component used. |
| Experiment Setup | Yes | Both the encoder and the decoder in our network have 4 stacked GRU layers with 300 cells in each layer. We use dropout (Srivastava et al. 2014) at a rate of 0.2 on the non-recurrent connections (Pham et al. 2014). The initial weights are drawn from the distribution U(−0.07, 0.07) and biases are initialized to 1.0. Our vocabulary has 129 unique tokens, each of which is embedded into a 50-dimensional vector. The network is trained using the Adam optimizer (Kingma and Ba 2015) with the learning and the decay rates set to their default values and a mini-batch size of 128. We clip the gradients to keep them within the range [−1, 1] and train the network for up to 20 epochs. |
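
The Experiment Setup row above is detailed enough to reconstruct the reported hyperparameters. The following is a minimal sketch, not the authors' released code: it expresses those settings with the modern `tf.keras` API (the paper used an earlier TensorFlow release), omits the attention mechanism and decoder for brevity, and uses layer and variable names that are purely illustrative assumptions.

```python
# Sketch of the reported DeepFix hyperparameters (assumption: tf.keras API;
# attention and decoder omitted; names are illustrative, not from the paper).
import tensorflow as tf

VOCAB_SIZE = 129      # unique tokens reported in the paper
EMBED_DIM = 50        # token embedding dimension
HIDDEN_UNITS = 300    # GRU cells per layer
NUM_LAYERS = 4        # stacked GRU layers in both encoder and decoder
DROPOUT = 0.2         # dropout on non-recurrent connections

weight_init = tf.keras.initializers.RandomUniform(-0.07, 0.07)  # U(-0.07, 0.07)
bias_init = tf.keras.initializers.Constant(1.0)                  # biases set to 1.0

def gru_stack():
    """Four stacked GRU layers with input-side (non-recurrent) dropout."""
    return [
        tf.keras.layers.GRU(
            HIDDEN_UNITS,
            return_sequences=True,
            dropout=DROPOUT,
            kernel_initializer=weight_init,
            bias_initializer=bias_init,
        )
        for _ in range(NUM_LAYERS)
    ]

# Encoder half of the seq2seq model; the decoder mirrors this stack.
encoder = tf.keras.Sequential(
    [tf.keras.layers.Embedding(VOCAB_SIZE, EMBED_DIM)] + gru_stack()
)

# Adam with default learning/decay rates; gradient elements clipped to [-1, 1].
# Training used mini-batches of 128 for up to 20 epochs.
optimizer = tf.keras.optimizers.Adam(clipvalue=1.0)
```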