Learning Deterministic Weighted Automata with Queries and Counterexamples

Authors: Gail Weiss, Yoav Goldberg, Eran Yahav

NeurIPS 2019

Reproducibility Variable Result LLM Response
Research Type Experimental We apply our algorithm to 2-layer LSTMs trained on grammars from the SPiCe competition [7], adaptations of the Tomita grammars [34] to PDFAs, and small PDFAs representing languages with unbounded history. The LSTMs have input dimensions 2-60 and hidden dimensions 20-100. The LSTMs and their training methods are fully described in Appendix E.
Compared Methods We compare our algorithm to the sample-based method ALERGIA [9], the spectral algorithm used in [2], and n-grams. Tables 1 and 2 show the results of extraction from the SPiCe and UHL RNNs, respectively.
Researcher Affiliation Academia Gail Weiss, Technion (sgailw@cs.technion.ac.il); Yoav Goldberg, Bar Ilan University and Allen Institute for AI (yogo@cs.biu.ac.il); Eran Yahav, Technion (yahave@cs.technion.ac.il)
Pseudocode No The paper describes the algorithm steps verbally and in an enumerated list but does not provide a formally labeled pseudocode block or algorithm figure.
Open Source Code Yes An implementation of the algorithm and an evaluation over extraction from LM-RNNs, including a comparison to other LM reconstruction techniques. Available at www.github.com/tech-srl/weighted_lstar
Open Datasets Yes We apply our algorithm to 2-layer LSTMs trained on grammars from the SPiCe competition [7], adaptations of the Tomita grammars [34] to PDFAs, and small PDFAs representing languages with unbounded history.
Dataset Splits No The paper describes training RNNs and then evaluating extracted models by sampling from the RNNs. It does not provide explicit train/validation/test dataset splits (e.g., percentages or counts) for the grammars or for the RNN training itself.
Hardware Specification No The paper does not provide specific details about the hardware (e.g., CPU, GPU models, memory) used for running the experiments.
Software Dependencies No The paper mentions tools like FLEXFRINGE but does not provide specific version numbers for any software dependencies or libraries used in the experiments.
Experiment Setup Yes Extraction Parameters Most of the extraction parameters differ between the RNNs, and are described in the results tables (1, 2). For our algorithm, we always limited the equivalence query to 500 samples. ... Our algorithm was run with t=0.1, ε_P, ε_S=0.01, |P| ≤ 5000 and |S| ≤ 100, and spectral with |P|, |S|=1000, with some exceptions (each applying to a particular grammar): t=0.05, ε_S=0; |P|, |S|=750; state_count=10,000.
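The equivalence query described above — bounded at 500 samples, with a tolerance t on per-sequence weights — can be sketched roughly as follows. This is a hypothetical illustration, not the paper's implementation: the `sample()` and `prob()` methods and the multiplicative-deviation check are assumed interfaces chosen for the sketch.

```python
import random

def sample_equivalence_query(target, hypothesis, n_samples=500, t=0.1, seed=0):
    """Sample-based equivalence query (illustrative sketch).

    Draws up to `n_samples` sequences from the target language model and
    returns the first sequence whose weight under the hypothesis deviates
    from the target's weight by more than the tolerance `t` (relative to
    the larger of the two). Returns None if no counterexample is found
    within the sample budget, i.e. the hypothesis is accepted.
    """
    rng = random.Random(seed)
    for _ in range(n_samples):
        seq = target.sample(rng)
        p, q = target.prob(seq), hypothesis.prob(seq)
        if abs(p - q) > t * max(p, q):
            return seq  # counterexample for the learner to process
    return None
```

Because the query only ever accepts up to the sample budget, a returned `None` means "no counterexample found in 500 draws", not a proof of equivalence — which matches the bounded-query setup the paper reports.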