HitNet: Hybrid Ternary Recurrent Neural Network
Authors: Peiqi Wang, Xinfeng Xie, Lei Deng, Guoqi Li, Dongsheng Wang, Yuan Xie
NeurIPS 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We test our method on typical RNN models, such as Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU). Overall, HitNet can quantize RNN models into ternary values of {-1, 0, 1} and significantly outperform the state-of-the-art methods towards extremely quantized RNNs. Specifically, we improve the perplexity per word (PPW) of a ternary LSTM on the Penn Tree Bank (PTB) corpus from 126 to 110.3 and a ternary GRU from 142 to 113.5. (A hedged ternarization sketch appears below the table.) |
| Researcher Affiliation | Academia | (1) Department of Computer Science and Technology, Tsinghua University; (2) Beijing National Research Center for Information Science and Technology; (3) Department of Precision Instrument, Tsinghua University; (4) Department of Electrical and Computer Engineering, University of California, Santa Barbara |
| Pseudocode | No | No structured pseudocode or algorithm blocks were found. |
| Open Source Code | No | No concrete access to source code for the methodology described in this paper was provided. |
| Open Datasets | Yes | All evaluations in this section adopt an LSTM model with one hidden layer of 300 units. The sequence length is set to 35, and it is applied on the Penn Tree Bank (PTB) corpus [30]. The accuracy is measured in perplexity per word (PPW), and a lower PPW value means better accuracy. ... We first use the Penn Tree Bank (PTB) corpus [30], which contains a 10K vocabulary. (A PPW computation sketch appears below the table.) |
| Dataset Splits | No | The paper mentions 'validation error' but does not provide specific dataset split information (percentages, sample counts, or citations to predefined splits) for training, validation, or test sets. |
| Hardware Specification | No | No specific hardware details (e.g., exact GPU/CPU models, processor types, or memory amounts) used for running experiments were provided. |
| Software Dependencies | No | No specific ancillary software details, such as library names with version numbers, were provided. |
| Experiment Setup | Yes | We initialize the learning rate as 20 and decrease it by a factor of 4 at the end of an epoch if the validation error exceeds the current best record. The sequence length is set to 35 and the gradient norm is clipped into the range of [-0.25, 0.25]. In addition, we set the maximum epoch to be 40 and set the dropout rate to be 0.2. ... We use a batch size of 20 for training... We train both LSTM and GRU with one 512-size hidden layer and set the batch size to 50. ... We train the models with one hidden 1024-size layer and set the batch size to be 50. (A training-loop sketch wiring these settings together appears below the table.) |
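
The Research Type row above summarizes the paper's central claim: RNN weights are quantized to the ternary set {-1, 0, 1}. Since the paper provides no pseudocode or source code, the snippet below is only a minimal sketch of threshold-based ternarization with a straight-through gradient estimator; the threshold rule (`delta = 0.7 * mean|w|`), the `ternarize` helper, and the use of PyTorch are assumptions for illustration, not the authors' exact HitNet procedure.

```python
import torch

def ternarize(w: torch.Tensor, delta_scale: float = 0.7) -> torch.Tensor:
    """Map a float tensor to {-1, 0, +1} using a magnitude threshold.

    NOTE: this per-tensor threshold rule is an illustrative choice,
    not the quantizer defined in the HitNet paper.
    """
    delta = delta_scale * w.abs().mean()   # threshold separating 0 from +/-1
    ternary = torch.zeros_like(w)
    ternary[w > delta] = 1.0
    ternary[w < -delta] = -1.0
    return ternary

class TernaryWeight(torch.autograd.Function):
    """Straight-through estimator: ternary values forward, identity gradient backward."""

    @staticmethod
    def forward(ctx, w):
        return ternarize(w)

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output  # pass gradients through to the full-precision weights

# Usage: quantize an LSTM weight matrix before it enters the matrix multiply.
w = torch.randn(300, 300, requires_grad=True)
w_t = TernaryWeight.apply(w)   # entries of w_t are now in {-1, 0, 1}
```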
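The accuracy metric quoted in the Open Datasets row, perplexity per word (PPW), is the standard exponentiated per-token cross-entropy used for PTB language modeling; a lower value is better. The helper below is a generic sketch of that computation (the function name and the PyTorch framing are mine, not the paper's):

```python
import torch
import torch.nn.functional as F

def perplexity_per_word(logits: torch.Tensor, targets: torch.Tensor) -> float:
    """PPW = exp(mean negative log-likelihood per target word).

    logits:  (num_tokens, vocab_size) unnormalized scores
    targets: (num_tokens,) gold word indices
    """
    nll = F.cross_entropy(logits, targets, reduction="mean")
    return torch.exp(nll).item()

# Example: random logits over the ~10K-word PTB vocabulary yield a PPW on the
# order of the vocabulary size, whereas the paper's ternary LSTM reaches 110.3.
logits = torch.randn(35 * 20, 10000)          # sequence length 35, batch size 20
targets = torch.randint(0, 10000, (35 * 20,))
print(perplexity_per_word(logits, targets))
```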
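The Experiment Setup row lists the reported PTB hyperparameters for the 300-unit LSTM: initial learning rate 20, decay by a factor of 4 when the validation record is not beaten, gradient norm clipping at 0.25, 40 epochs, dropout 0.2, sequence length 35, and batch size 20. A minimal training-loop skeleton wiring those numbers together might look like the following; the model structure, the stub data iterator, and the plain SGD update are assumptions, since the paper does not state the optimizer or release code.

```python
import math
import torch
import torch.nn as nn

# Hyperparameters quoted from the paper's PTB setup.
HIDDEN, SEQ_LEN, BATCH, VOCAB = 300, 35, 20, 10000
LR, DECAY, CLIP, EPOCHS, DROPOUT = 20.0, 4.0, 0.25, 40, 0.2

class WordLM(nn.Module):
    """One-layer, 300-unit LSTM language model (layer sizes from the paper)."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, HIDDEN)
        self.drop = nn.Dropout(DROPOUT)
        self.lstm = nn.LSTM(HIDDEN, HIDDEN, num_layers=1, batch_first=True)
        self.head = nn.Linear(HIDDEN, VOCAB)

    def forward(self, x):
        out, _ = self.lstm(self.drop(self.embed(x)))
        return self.head(self.drop(out))

def random_batches(n):
    """Stub iterator yielding random (input, target) pairs in place of real PTB data."""
    for _ in range(n):
        yield (torch.randint(0, VOCAB, (BATCH, SEQ_LEN)),
               torch.randint(0, VOCAB, (BATCH, SEQ_LEN)))

model, criterion = WordLM(), nn.CrossEntropyLoss()
lr, best_ppw = LR, float("inf")

for epoch in range(EPOCHS):
    model.train()
    for inputs, targets in random_batches(5):
        loss = criterion(model(inputs).flatten(0, 1), targets.flatten())
        model.zero_grad()
        loss.backward()
        nn.utils.clip_grad_norm_(model.parameters(), CLIP)   # gradient norm clipped at 0.25
        with torch.no_grad():
            for p in model.parameters():                     # plain SGD step (optimizer assumed)
                p -= lr * p.grad

    model.eval()
    with torch.no_grad():
        x, y = next(random_batches(1))
        val_ppw = math.exp(criterion(model(x).flatten(0, 1), y.flatten()).item())
    if val_ppw > best_ppw:
        lr /= DECAY        # decay the learning rate by 4 when validation does not improve
    else:
        best_ppw = val_ppw
```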