Bit-Pragmatic Deep Neural Network Computing

Authors: Jorge Albericio, Patrick Judd, Alberto Delmas, Sayeh Sharify, Andreas Moshovos

ICLR 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Measurements demonstrate that for the convolutional layers of Convolutional Neural Networks during inference, PRA improves performance by 4.3x over the DaDianNao (DaDN) accelerator (Chen et al., 2014) and by 4.5x when DaDN uses an 8-bit quantized representation (Warden, 2016). Experimental measurements with recent CNNs for image classification demonstrate that the most straightforward PRA variant boosts average performance for the convolutional layers to 2.59x over the state-of-the-art DaDN accelerator.
Researcher Affiliation | Academia | Jorge Albericio, Patrick Judd, Alberto Delmas Lascorz, Sayeh Sharify & Andreas Moshovos, Electrical and Computer Engineering, University of Toronto, Toronto, ON, M5S 3G4, Canada. {jorge, juddpatr, delmasl1, sayeh, moshovos}@ece.utoronto.ca
Pseudocode | No | The paper describes the Pragmatic engine and its units (e.g., Figure 2b, Figure 4) but does not provide structured pseudocode or algorithm blocks. (An illustrative sketch of the essential-bit computation PRA performs appears after this table.)
Open Source Code | No | The paper does not provide an explicit statement or link to its own open-source code for the described methodology. It cites external resources such as 'https://github.com/google/gemmlowp' (Google, 2016), but these are not the authors' own implementation.
Open Datasets | Yes | Experimental measurements with recent CNNs for image classification demonstrate that the most straightforward PRA variant boosts average performance for the convolutional layers to 2.59x over the state-of-the-art DaDN accelerator. Table 2: Per convolutional layer activation precision profiles. AlexNet ... VGG 19.
Dataset Splits | No | The paper mentions using 'recent CNNs for image classification' and provides 'Per convolutional layer activation precision profiles' in Table 2, but does not explicitly state the dataset splits (e.g., training, validation, test percentages or counts) or refer to standard predefined splits for these networks.
Hardware Specification | No | The paper mentions that designs were synthesized with the Synopsys Design Compiler for a TSMC 65nm library and that memory blocks were modeled using CACTI and Destiny, but it does not specify the particular hardware (e.g., CPU, GPU models, or compute cluster specifications) on which the simulations or experiments were executed.
Software Dependencies | No | The paper mentions software tools such as the Synopsys Design Compiler, CACTI, and Destiny for modeling and synthesis, and references TensorFlow for quantization. However, it does not provide specific version numbers for any of these software dependencies.
Experiment Setup | Yes | This section evaluates the single-stage shifting PRA configuration of Sections 5 and 5.1, and the 2-stage shifting variants of Section 5.1. Section 6.1 reports performance while Section 6.2 reports area and power. In this section, all PRA systems use pallet synchronization. Configuration PRAxR-2b refers to a configuration using x SSRs. This work investigates a software-guided approach where the precision requirements of each layer are used to zero out a number of prefix and suffix bits at the output of each layer. (An illustrative sketch of this precision-guided bit trimming also appears after this table.)
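
Since the paper provides no pseudocode, the following is a minimal sketch (not the authors' code) of the essential-bit inner product that Bit-Pragmatic (PRA) implements in hardware: each activation is decomposed into its nonzero powers of two ("oneffsets"), each oneffset shifts the corresponding weight, and the shifted weights are accumulated. Work therefore scales with the count of essential activation bits rather than the full fixed-point width processed by a bit-parallel engine such as DaDianNao, which is the source of the reported speedups. Function names and the 16-bit width are illustrative assumptions.

```python
def oneffsets(activation: int):
    """Yield the exponents of the nonzero bits of a non-negative fixed-point activation."""
    exponent = 0
    while activation:
        if activation & 1:
            yield exponent
        activation >>= 1
        exponent += 1


def pra_inner_product(activations, weights):
    """Bit-serial inner product that processes only essential activation bits."""
    accumulator = 0
    cycles = 0
    for a, w in zip(activations, weights):
        for exp in oneffsets(a):
            accumulator += w << exp   # one shift-and-add per essential bit
            cycles += 1               # rough proxy for PRA serial cycles
    return accumulator, cycles


if __name__ == "__main__":
    acts = [0b0000000000000101, 0b0000000000110000, 0]   # few essential bits each
    wts = [3, -2, 7]
    result, essential_cycles = pra_inner_product(acts, wts)
    # A 16-bit bit-parallel/bit-serial baseline would spend 16 bit slots per activation.
    print(result, essential_cycles, 16 * len(acts))
```

In this toy example only 4 essential bits are processed instead of 48 bit slots, mirroring (in idealized form) how PRA's performance depends on the essential bit content of the activations.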
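
The software-guided precision approach quoted above can likewise be sketched in a few lines: given a per-layer precision profile (as in Table 2), prefix bits above the needed most-significant bit and suffix bits below the needed least-significant bit are zeroed at the output of each layer, which reduces the essential bit content PRA must process. This is a hedged illustration, not the authors' implementation; the `profile` values and names below are hypothetical.

```python
def trim_precision(activation: int, msb: int, lsb: int) -> int:
    """Zero all bits above position `msb` and below position `lsb`, keeping the range in between."""
    mask = ((1 << (msb - lsb + 1)) - 1) << lsb
    return activation & mask


# Hypothetical per-layer profile: layer name -> (msb, lsb) of the required bit range.
profile = {"conv1": (9, 0), "conv2": (8, 1), "conv3": (7, 2)}

layer_output = 0b1010110110110111          # example 16-bit fixed-point activation
msb, lsb = profile["conv3"]
print(bin(trim_precision(layer_output, msb, lsb)))
```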