DWM: A Decomposable Winograd Method for Convolution Acceleration

Authors: Di Huang, Xishan Zhang, Rui Zhang, Tian Zhi, Deyuan He, Jiaming Guo, Chang Liu, Qi Guo, Zidong Du, Shaoli Liu, Tianshi Chen, Yunji Chen

AAAI 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments show that DWM achieves 2× acceleration while keeping the numerical error under 1e-07, which is close to the numerical accuracy of FP32 convolution. Experiments Setup: All the results were tested on an NVIDIA V100 GPU.
Researcher Affiliation | Collaboration | 1 SKL of Computer Architecture, Institute of Computing Technology, CAS; 2 University of Chinese Academy of Sciences; 3 Cambricon Tech. Ltd; 4 Institute of Brain-Intelligence Technology, Zhangjiang Laboratory; 5 Shanghai Research Center for Brain Science and Brain-Inspired Intelligence; 6 CAS Center for Excellence in Brain Science and Intelligence Technology. Emails: {huangdi18s, zhangxishan, zhangrui, zhitian, hedeyuan18s, guojiaming18s, liuchang18s, guoqi, duzidong, cyj}@ict.ac.cn; {liushaoli, tchen}@cambricon.com
Pseudocode | No | The paper describes methods in text and uses equations and diagrams (such as Figure 3, which shows a process flow), but it does not present structured pseudocode or algorithm blocks.
Open Source Code | No | The paper mentions implementing DWM on TensorFlow and PyTorch but does not provide an explicit statement about releasing the source code or a link to a repository.
Open Datasets | Yes | The networks' accuracy was measured on ImageNet 2012 (Russakovsky et al. 2015).
Dataset Splits | No | The paper mentions using ImageNet 2012 for accuracy measurement and refers to a "test" setting, but it does not explicitly specify training, validation, and test splits (e.g., exact percentages, sample counts, or the splitting methodology).
Hardware Specification | Yes | Experiments Setup: All the results were tested on an NVIDIA V100 GPU.
Software Dependencies | No | The paper mentions 'TensorFlow (Abadi et al. 2016) and PyTorch (Paszke et al. 2017)' but does not provide specific version numbers for these or any other software dependencies.
Experiment Setup | Yes | For all single-layer tests, the batch size is set to 256 and the layers are same-padded. We estimated the mean squared error (MSE) between several methods and the FP64 results by performing a forward convolution. The input signal and convolution weights are random numbers drawn from a standard normal distribution using NumPy with seed 11. ... The batch size, the channels, and the filters are 256. The input size is fixed to 14×14.
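
To make the numerical-error measurement above concrete, here is a minimal sketch under stated assumptions: inputs and weights drawn from a standard normal distribution with NumPy seed 11, batch/channels/filters of 256, a 14×14 input, and "same" padding. The kernel size is not fixed by the quoted setup, so a 3×3 kernel is assumed here, and torch.nn.functional.conv2d stands in for the convolution under test (the authors' DWM kernels are not publicly released); the MSE is computed against an FP64 reference, as described in the row above.

```python
# Hedged sketch of the MSE-vs-FP64 measurement described in the Experiment Setup row.
# Assumptions not stated in the paper excerpt: 3x3 kernel, F.conv2d as the convolution
# under test. Sizes follow the quoted setup and can be reduced for a quicker run.
import numpy as np
import torch
import torch.nn.functional as F

np.random.seed(11)  # seed 11, as quoted from the paper's setup

batch, channels, filters, size, ksize = 256, 256, 256, 14, 3  # ksize=3 is an assumption

x = np.random.randn(batch, channels, size, size).astype(np.float32)
w = np.random.randn(filters, channels, ksize, ksize).astype(np.float32)

# "Same" padding for an odd kernel size.
pad = (ksize - 1) // 2

# Convolution under test (FP32 here; a DWM/Winograd kernel would replace this call).
y_fp32 = F.conv2d(torch.from_numpy(x), torch.from_numpy(w), padding=pad)

# FP64 reference convolution.
y_fp64 = F.conv2d(torch.from_numpy(x).double(),
                  torch.from_numpy(w).double(), padding=pad)

mse = torch.mean((y_fp32.double() - y_fp64) ** 2).item()
print(f"MSE vs FP64 reference: {mse:.3e}")
```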