Extracting Automata from Recurrent Neural Networks Using Queries and Counterexamples
Authors: Gail Weiss, Yoav Goldberg, Eran Yahav
ICML 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 8. Experimental Results We demonstrate the effectiveness of our method on networks trained on the Tomita grammars (1982), used as benchmarks in previous automata-extraction work (Wang et al., 2017), and on substantially more complicated languages. We show the effectiveness of our equivalence query approach over simple random sampling and present cases in which our method extracts informative DFAs whereas other approaches fail. In addition, for some seemingly perfect networks, we find that our method quickly returns counterexamples representing deviations from the target language. |
| Researcher Affiliation | Academia | 1Technion, Haifa, Israel 2Bar Ilan University, Ramat Gan, Israel. Correspondence to: Gail Weiss <sgailw@cs.technion.ac.il>. |
| Pseudocode | Yes | Algorithm 1 shows pseudocode for this equivalence checking. |
| Open Source Code | No | The paper mentions implementing methods in Python using PyTorch and scikit-learn, but does not provide a specific link or explicit statement about the availability of their own source code. |
| Open Datasets | Yes | We demonstrate the effectiveness of our method on networks trained on the Tomita grammars (1982), used as benchmarks in previous automata-extraction work (Wang et al., 2017), and on substantially more complicated languages. The train sets contained samples of various lengths, with a 1:1 ratio between the positive and negative samples from each length where possible. |
| Dataset Splits | Yes | Training Setup As our focus was extraction, we trained all networks to 100% accuracy on their train sets, and of these we considered only those that reached 99.9+% accuracy on a dev set consisting of up to 1000 uniformly sampled words of each of the lengths n = 1, 4, 7, ..., 28. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | We implemented all methods in Python, using PyTorch (Paszke et al., 2017) and scikit-learn (Pedregosa et al., 2011). |
| Experiment Setup | Yes | On all networks, we applied our method with initial refinement depth 10. ... For the SVM classifiers, we used the SVC variant, with regularization factor C = 10^4 to encourage perfect splits, and otherwise default parameters (in particular, the RBF kernel with gamma value 1/(num features)). ... The networks were a 2-layer GRU and a 2-layer LSTM, both with hidden size 50 per cell. |
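The dev-set protocol quoted in the Dataset Splits row (up to 1000 uniformly sampled words of each length n = 1, 4, 7, ..., 28) can be sketched with the standard library alone. This is a minimal illustration, not the authors' code; it assumes a binary alphabet, which holds for the Tomita grammars but not necessarily for the paper's more complicated languages.

```python
import random

ALPHABET = "01"  # assumption: binary alphabet, as in the Tomita grammars

def sample_dev_set(max_per_length=1000, lengths=range(1, 29, 3), seed=0):
    """Uniformly sample up to max_per_length distinct words of each length."""
    rng = random.Random(seed)
    dev = []
    for n in lengths:
        universe = len(ALPHABET) ** n  # number of distinct words of length n
        if universe <= max_per_length:
            # Short lengths: the whole universe is smaller than the cap,
            # hence "up to 1000" words per length in the quoted setup.
            words = {format(i, f"0{n}b") for i in range(universe)}
        else:
            words = set()
            while len(words) < max_per_length:
                words.add("".join(rng.choice(ALPHABET) for _ in range(n)))
        dev.extend(sorted(words))
    return dev

dev = sample_dev_set()
```

For n = 1 only two binary words exist, so the sample is exhaustive there, while lengths of 10 and above are capped at 1000 words each.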
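The classifier configuration quoted in the Experiment Setup row maps directly onto scikit-learn's `SVC`. The sketch below shows that mapping; note that gamma = 1/(num features) corresponds to `gamma="auto"` (the historical scikit-learn default), and the toy data is a hypothetical stand-in for the RNN hidden-state vectors the paper actually classifies.

```python
from sklearn.svm import SVC

# SVC with a large regularization factor C = 10^4 to encourage perfect
# splits, RBF kernel, and gamma = 1/num_features (gamma="auto").
clf = SVC(C=1e4, kernel="rbf", gamma="auto")

# Hypothetical 2-D points standing in for hidden-state vectors.
X = [[0.0, 0.0], [0.1, 0.1], [1.0, 1.0], [0.9, 1.1]]
y = [0, 0, 1, 1]
clf.fit(X, y)
```

The large C penalizes margin violations heavily, which is what drives the "perfect splits" behaviour the quote describes.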