The Synthesis of XNOR Recurrent Neural Networks with Stochastic Logic
Authors: Arash Ardakani, Zhengyun Ji, Amir Ardakani, Warren Gross
NeurIPS 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results show that the proposed XNOR LSTMs reduce the computational complexity of their quantized counterparts by a factor of 86 without any sacrifice on latency while achieving a better accuracy across various temporal tasks. (Abstract) and In this section, we evaluate the performance of the proposed XNOR LSTM across different temporal tasks including character-level/word-level language modeling and question answering (QA). (Section 5) |
| Researcher Affiliation | Academia | Arash Ardakani, Zhengyun Ji, Amir Ardakani, Warren J. Gross, Department of Electrical and Computer Engineering, McGill University, Montreal, Canada. {arash.ardakani, zhengyun.ji, amir.ardakani}@mail.mcgill.ca, warren.gross@mcgill.ca |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described. |
| Open Datasets | Yes | For the character-level and word-level language modeling, we conduct our experiments on Penn Treebank (PTB) [31] corpus. ... For the QA task, we perform our experiment on the CNN corpus [32]. |
| Dataset Splits | No | The paper describes using Penn Treebank and CNN corpus, but does not provide specific details on the train/validation/test dataset splits (e.g., percentages or exact counts for each split). |
| Hardware Specification | Yes | It is worth mentioning that a full-precision multiplier requires 200× more Xilinx FPGA slices than an XNOR gate [17]. (Introduction) and To this end, we have implemented both the non-stochastic binarized method (e.g., [26]) and our proposed method on a Xilinx Virtex-7 FPGA device where each architecture contains 300 neurons. (Section 6) See the stochastic-multiplication sketch after this table. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers. |
| Experiment Setup | Yes | For the character-level language modeling (CLLM) experiment, we use an LSTM layer of size 1,000 on a sequence length of 100 when performing PTB. We set the training parameters similar to [31]. (Section 5) and For the word-level language modeling (WLLM) task, we train one layer of LSTM with 300 units on a sequence length of 35 while applying the dropout rate of 0.5. (Section 5) See the configuration sketch after this table. |
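
The Hardware Specification row contrasts a full-precision multiplier with a single XNOR gate. That comparison rests on the core identity behind stochastic logic: under bipolar encoding, where a value x in [-1, 1] is represented by a bitstream whose bits are 1 with probability (x + 1)/2, multiplying two streams reduces to a bitwise XNOR. The NumPy sketch below illustrates that identity only; the stream length, seed, and helper names are illustrative choices and are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def to_bipolar_stream(x, length):
    """Encode x in [-1, 1] as a stochastic bitstream with P(bit = 1) = (x + 1) / 2."""
    return (rng.random(length) < (x + 1) / 2).astype(np.uint8)

def from_bipolar_stream(bits):
    """Decode a bipolar bitstream back to an estimate in [-1, 1]."""
    return 2 * bits.mean() - 1

def xnor_multiply(a_bits, b_bits):
    """Bipolar stochastic multiplication: one XNOR per bit replaces a multiplier."""
    return 1 - (a_bits ^ b_bits)

a, b = 0.6, -0.5
length = 4096  # longer streams give lower-variance estimates
prod = from_bipolar_stream(
    xnor_multiply(to_bipolar_stream(a, length), to_bipolar_stream(b, length))
)
print(prod)  # ~ a * b = -0.30, up to stochastic-computing noise
```

Longer streams shrink the variance of the decoded product, which is the usual latency/accuracy trade-off in stochastic computing.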
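
The Experiment Setup row quotes the LSTM sizes used for the two language-modeling tasks. The PyTorch sketch below mirrors only those quoted hyperparameters (one LSTM layer, 1,000 units with sequence length 100 for CLLM; 300 units with sequence length 35 and dropout 0.5 for WLLM); the vocabulary sizes, embedding size, and batch size are placeholders, and it uses a standard `nn.LSTM` rather than the paper's XNOR/stochastic implementation.

```python
import torch
import torch.nn as nn

class OneLayerLSTMLM(nn.Module):
    """Single-layer LSTM language model; only the hidden size and dropout
    come from the quoted setup, the rest are placeholder choices."""
    def __init__(self, vocab_size, hidden_size, dropout):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)
        self.lstm = nn.LSTM(hidden_size, hidden_size, num_layers=1, batch_first=True)
        self.drop = nn.Dropout(dropout)
        self.head = nn.Linear(hidden_size, vocab_size)

    def forward(self, tokens):
        hidden, _ = self.lstm(self.embed(tokens))
        return self.head(self.drop(hidden))

# CLLM on PTB: one LSTM layer of size 1,000, trained on sequences of length 100
# (character vocabulary size is a placeholder).
cllm = OneLayerLSTMLM(vocab_size=50, hidden_size=1000, dropout=0.0)

# WLLM on PTB: one LSTM layer with 300 units, sequences of length 35, dropout 0.5
# (word vocabulary size is a placeholder).
wllm = OneLayerLSTMLM(vocab_size=10000, hidden_size=300, dropout=0.5)

batch = torch.randint(0, 10000, (16, 35))  # (batch=16, seq_len=35); batch size is a placeholder
logits = wllm(batch)                       # -> (16, 35, 10000)
```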