The Synthesis of XNOR Recurrent Neural Networks with Stochastic Logic

Authors: Arash Ardakani, Zhengyun Ji, Amir Ardakani, Warren Gross

NeurIPS 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experimental results show that the proposed XNOR LSTMs reduce the computational complexity of their quantized counterparts by a factor of 86 without any sacrifice on latency while achieving a better accuracy across various temporal tasks." (Abstract) and "In this section, we evaluate the performance of the proposed XNOR LSTM across different temporal tasks including character-level/word-level language modeling and question answering (QA)." (Section 5, Experimental Results)
Researcher Affiliation | Academia | Arash Ardakani, Zhengyun Ji, Amir Ardakani, Warren J. Gross; Department of Electrical and Computer Engineering, McGill University, Montreal, Canada; {arash.ardakani, zhengyun.ji, amir.ardakani}@mail.mcgill.ca, warren.gross@mcgill.ca
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide concrete access to source code for the methodology described.
Open Datasets | Yes | "For the character-level and word-level language modeling, we conduct our experiments on Penn Treebank (PTB) [31] corpus. ... For the QA task, we perform our experiment on the CNN corpus [32]."
Dataset Splits | No | The paper describes using the Penn Treebank and CNN corpora, but does not provide specific details on the train/validation/test splits (e.g., percentages or exact counts for each split).
Hardware Specification | Yes | "It is worth mentioning that a full-precision multiplier requires 200 more Xilinx FPGA slices than an XNOR gate [17]." (Introduction) and "To this end, we have implemented both the non-stochastic binarized method (e.g., [26]) and our proposed method on a Xilinx Virtex-7 FPGA device where each architecture contains 300 neurons." (Section 6)
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers.
Experiment Setup | Yes | "For the character-level language modeling (CLLM) experiment, we use an LSTM layer of size 1,000 on a sequence length of 100 when performing PTB. We set the training parameters similar to [31]." (Section 5) and "For the word-level language modeling (WLLM) task, we train one layer of LSTM with 300 units on a sequence length of 35 while applying the dropout rate of 0.5." (Section 5)
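
To make the quoted setup concrete, below is a minimal sketch in PyTorch of the two LSTM configurations the paper reports for PTB language modeling (CLLM: one 1,000-unit LSTM layer, sequence length 100; WLLM: one 300-unit LSTM layer, sequence length 35, dropout 0.5). The vocabulary sizes, embedding dimension, batch size, and where dropout is applied are illustrative assumptions, and the paper's XNOR/stochastic-logic binarization is not reproduced here.

```python
# Sketch of the baseline LSTM configurations quoted above (PTB language modeling).
# Vocabulary sizes, embedding size, and batch size are illustrative placeholders;
# the paper's XNOR / stochastic-computing binarization is NOT implemented here.
import torch
import torch.nn as nn


class LSTMLanguageModel(nn.Module):
    def __init__(self, vocab_size, hidden_size, dropout=0.0):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)
        self.lstm = nn.LSTM(hidden_size, hidden_size, num_layers=1, batch_first=True)
        self.drop = nn.Dropout(dropout)   # assumed placement: on the LSTM outputs
        self.proj = nn.Linear(hidden_size, vocab_size)

    def forward(self, tokens, state=None):
        x = self.embed(tokens)                 # (batch, seq_len, hidden)
        out, state = self.lstm(x, state)
        logits = self.proj(self.drop(out))     # (batch, seq_len, vocab)
        return logits, state


# Character-level LM (CLLM): 1,000-unit LSTM, sequence length 100.
# A character vocabulary of 50 is an assumed placeholder.
cllm = LSTMLanguageModel(vocab_size=50, hidden_size=1000)
chars = torch.randint(0, 50, (20, 100))        # batch of 20, seq_len 100
cllm_logits, _ = cllm(chars)

# Word-level LM (WLLM): one 300-unit LSTM layer, sequence length 35, dropout 0.5.
# A word vocabulary of 10,000 is an assumed placeholder.
wllm = LSTMLanguageModel(vocab_size=10000, hidden_size=300, dropout=0.5)
words = torch.randint(0, 10000, (20, 35))      # batch of 20, seq_len 35
wllm_logits, _ = wllm(words)
```

Reproducing the paper's results would additionally require binarizing the weights/activations and replacing the multiplications with XNOR-based stochastic logic, as well as the training hyperparameters it borrows from [31], none of which are shown in this sketch.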