SkipW: Resource Adaptable RNN with Strict Upper Computational Limit

Authors: Tsiry Mayet, Anne Lambert, Pascal Leguyadec, Françoise Le Bolzer, François Schnitzler

ICLR 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate this approach on four datasets: a human activity recognition task, sequential MNIST, IMDB and the adding task. Our results show that Skip-Window is often able to exceed the accuracy of existing approaches for a lower computational cost while strictly limiting said cost.
Researcher Affiliation | Industry | InterDigital Inc., Cesson-Sévigné, France, {firstname.lastname}@interdigital.com
Pseudocode | No | The paper contains equations and architectural descriptions but no clearly labeled pseudocode or algorithm blocks. (An illustrative sketch of a window-based skipping step is given after the table.)
Open Source Code | No | The paper does not provide any explicit statement about open-sourcing the code or a link to a code repository.
Open Datasets | Yes | We evaluate our approach on four data sets: Human Activity Recognition (HAR) (Ofli et al., 2013)... Sequential MNIST (LeCun et al., 1998)... the adding task (Hochreiter & Schmidhuber, 1997)... IMDB (Maas et al., 2011)... (A generator for the adding task is sketched after the table.)
Dataset Splits | Yes | HAR: the dataset is split into 2 independent partitions: 22,625 sequences for training and 5,751 for validation. Sequential MNIST: we follow the standard data split and set aside 5,000 training samples for validation purposes. IMDB: we set aside about 15% of training data for validation purposes. (A split example follows the table.)
Hardware Specification | Yes | We implement the full service, from images to activity recognition, on an Nvidia Jetson Nano platform... We evaluate the performance of Skip-W on small hardware... Results on Jetson TX2 and Raspberry Pi 4 lead to similar conclusions (Appendix G). The hardware specification of these different devices is provided in Table 3.
Software Dependencies | No | The paper mentions software like OpenPose and PoseNet (MobileNetV1 architecture with a 0.75 multiplier) but does not provide specific version numbers for any software dependencies.
Experiment Setup | Yes | The model is trained with batches of 512 sequences using a decaying learning rate for 600 epochs. The model architecture consists of a two-layer stacked RNN of 60 GRU cells each, followed by a fully connected layer with a ReLU activation function. The following parameters were included in the search: batch size: 4096 and 512; λ ∈ {1e-4, 1e-3, 1e-2}; cell type: LSTM or GRU; number of cells ∈ {30, 40, 50, 60} per layer (identical number of cells in each layer); window size L ∈ {4, 8, 16} (Skip-W only). (A sketch of this backbone follows the table.)
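
Since the Pseudocode row notes that the paper provides no algorithm block, the following is a minimal illustrative sketch of a generic window-based skipping recurrence: within every window of L timesteps, at most k receive a full RNN update, which gives a strict per-window computational limit of the kind the title refers to. The score-based top-k selection and every function name here are assumptions made for illustration, not the authors' actual Skip-W mechanism.

# Illustrative sketch only: a generic window-based skipping recurrence with a
# strict per-window budget. The score-based top-k selection and the function
# names are assumptions, not the mechanism described in the paper.
def skip_window_forward(inputs, h0, rnn_cell, score, L=8, k=2):
    """Run rnn_cell on at most k of every L consecutive timesteps;
    on skipped timesteps the hidden state is carried over unchanged."""
    h, outputs = h0, []
    for start in range(0, len(inputs), L):
        window = inputs[start:start + L]
        # Rank the timesteps of this window by a relevance score and keep
        # only the k highest-scoring ones for a full RNN update.
        ranked = sorted(range(len(window)), key=lambda t: score(window[t]), reverse=True)
        selected = set(ranked[:k])
        for t, x_t in enumerate(window):
            if t in selected:
                h = rnn_cell(x_t, h)   # full update (counts toward the budget)
            outputs.append(h)          # skipped steps simply copy the state
    return outputs, h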
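
The adding task cited in the Open Datasets row (Hochreiter & Schmidhuber, 1997) is a synthetic benchmark that is easy to regenerate. A minimal NumPy sketch of its usual formulation follows; the sample count, sequence length, and seed are placeholders rather than the paper's settings.

import numpy as np

def make_adding_task(n_samples=1000, seq_len=100, seed=0):
    """Standard adding task: every timestep carries a random value and a
    marker bit; exactly two timesteps are marked, and the regression target
    is the sum of their values. Sizes here are placeholders."""
    rng = np.random.default_rng(seed)
    values = rng.uniform(0.0, 1.0, size=(n_samples, seq_len))
    markers = np.zeros((n_samples, seq_len))
    targets = np.empty(n_samples)
    for i in range(n_samples):
        a, b = rng.choice(seq_len, size=2, replace=False)  # the two marked positions
        markers[i, [a, b]] = 1.0
        targets[i] = values[i, a] + values[i, b]
    x = np.stack([values, markers], axis=-1)  # shape (n_samples, seq_len, 2)
    return x, targets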
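
The splits quoted in the Dataset Splits row are simple to reproduce for the public datasets. The sketch below assumes the Keras dataset loaders and scikit-learn, neither of which is confirmed by the paper; the random seed and IMDB vocabulary size are placeholders.

from sklearn.model_selection import train_test_split
from tensorflow.keras.datasets import imdb, mnist

# Sequential MNIST: keep 5,000 of the 60,000 training images for validation.
(x_tr, y_tr), _ = mnist.load_data()
x_train, x_val, y_train, y_val = train_test_split(
    x_tr.reshape(len(x_tr), -1), y_tr, test_size=5000, random_state=0)

# IMDB: set aside about 15% of the training reviews for validation.
(reviews, labels), _ = imdb.load_data(num_words=10000)  # vocabulary size is a placeholder
rev_train, rev_val, lab_train, lab_val = train_test_split(
    reviews, labels, test_size=0.15, random_state=0)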
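
Finally, the backbone quoted in the Experiment Setup row, two stacked layers of 60 GRU cells followed by a fully connected layer with a ReLU activation, can be sketched directly. The PyTorch version below is an assumption about framework and shapes: the input feature size, the number of output classes, and reading the prediction from the last timestep are placeholders, not details taken from the paper.

import torch
import torch.nn as nn

class TwoLayerGRUBaseline(nn.Module):
    """Sketch of the described backbone: two stacked GRU layers of 60 cells
    each, followed by a fully connected layer with a ReLU activation.
    Input and output sizes are placeholders, not values from the paper."""

    def __init__(self, input_size=3, hidden_size=60, num_classes=11):
        super().__init__()
        self.rnn = nn.GRU(input_size, hidden_size, num_layers=2, batch_first=True)
        self.head = nn.Sequential(nn.Linear(hidden_size, num_classes), nn.ReLU())

    def forward(self, x):              # x: (batch, time, features)
        out, _ = self.rnn(x)
        return self.head(out[:, -1])   # read out from the last timestep

model = TwoLayerGRUBaseline()
dummy = torch.randn(4, 50, 3)          # batch of 4 sequences, 50 timesteps, 3 features
print(model(dummy).shape)              # torch.Size([4, 11])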