PyTorch: An Imperative Style, High-Performance Deep Learning Library

Authors: Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, Alban Desmaison, Andreas Köpf, Edward Yang, Zachary DeVito, Martin Raison, Alykhan Tejani, Sasank Chilamkurthy, Benoit Steiner, Lu Fang, Junjie Bai, Soumith Chintala

NeurIPS 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | 6 Evaluation: In this section we compare the performance of PyTorch with several other commonly-used deep learning libraries, and find that it achieves competitive performance across a range of tasks. All experiments were performed on a workstation with two Intel Xeon E5-2698 v4 CPUs and one NVIDIA Quadro GP100 GPU. 6.1 Asynchronous dataflow: We start by quantifying the ability of PyTorch to asynchronously execute dataflow on GPU. We use the built-in profiler [44] to instrument various benchmarks and record a timeline of the execution of a single training step. 6.2 Memory management: We used the NVIDIA profiler to trace the execution of the CUDA runtime as well as the execution of the CUDA kernels launched during one training iteration of the ResNet-50 model. 6.3 Benchmarks: Finally, we can get an overall sense of single-machine eager mode performance of PyTorch by comparing it to three popular graph-based deep learning frameworks (CNTK, MXNet and TensorFlow), a define-by-run framework (Chainer), and a production-oriented platform (PaddlePaddle). Our results are summarized in Table 1. (A profiler usage sketch follows the table below.)
Researcher Affiliation | Collaboration | Adam Paszke (University of Warsaw) adam.paszke@gmail.com; Sam Gross (Facebook AI Research) sgross@fb.com; Francisco Massa (Facebook AI Research) fmassa@fb.com; Adam Lerer (Facebook AI Research) alerer@fb.com; James Bradbury (Google) jekbradbury@gmail.com; Gregory Chanan (Facebook AI Research) gchanan@fb.com; Trevor Killeen (Self Employed) killeent@cs.washington.edu; Zeming Lin (Facebook AI Research) zlin@fb.com; Natalia Gimelshein (NVIDIA) ngimelshein@nvidia.com; Luca Antiga (Orobix) luca.antiga@orobix.com; Alban Desmaison (Oxford University) alban@robots.ox.ac.uk; Andreas Köpf (Xamla) andreas.koepf@xamla.com; Edward Yang (Facebook AI Research) ezyang@fb.com; Zachary DeVito (Facebook AI Research) zdevito@cs.stanford.edu; Martin Raison (Nabla) martinraison@gmail.com; Alykhan Tejani (Twitter) atejani@twitter.com; Sasank Chilamkurthy (Qure.ai) sasankchilamkurthy@gmail.com; Benoit Steiner (Facebook AI Research) benoitsteiner@fb.com; Lu Fang (Facebook) lufang@fb.com; Junjie Bai (Facebook) jbai@fb.com; Soumith Chintala (Facebook AI Research) soumith@gmail.com
Pseudocode | Yes | Listing 1: A custom layer used as a building block for a simple but complete neural network. (...) Listing 2: Simplified training of a generative adversarial network. (A sketch in the spirit of Listing 1 follows the table below.)
Open Source Code | Yes | This paper introduces PyTorch, a Python library that performs immediate execution of dynamic tensor computations with automatic differentiation and GPU acceleration, and does so while maintaining performance comparable to the fastest current libraries for deep learning. (A minimal example of this imperative style follows the table below.)
Open Datasets | Yes | Table 1: Training speed for 6 models using 32-bit floats. Throughput is measured in images per second for the AlexNet, VGG-19, ResNet-50, and MobileNet models, in tokens per second for the GNMTv2 model, and in samples per second for the NCF model. (...) The Appendix details all the steps needed to reproduce our setup.
Dataset Splits | Yes | The Appendix details all the steps needed to reproduce our setup.
Hardware Specification | Yes | All experiments were performed on a workstation with two Intel Xeon E5-2698 v4 CPUs and one NVIDIA Quadro GP100 GPU.
Software Dependencies | Yes | The PyTorch team. PyTorch Autograd Profiler. https://pytorch.org/docs/1.0.1/autograd.html#profiler
Experiment Setup | Yes | The Appendix details all the steps needed to reproduce our setup.
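
The Research Type row quotes the paper's use of the built-in profiler [44] to record a timeline of a single training step. The following is a minimal sketch of how such a trace can be captured with torch.autograd.profiler; the ResNet-50 model from torchvision, the batch size, and the output file name are illustrative assumptions, not the paper's exact benchmark code.

    import torch
    import torchvision.models as models  # assumption: torchvision is installed

    # Sketch: profile one training step on the GPU and export a timeline.
    model = models.resnet50().cuda()
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
    inputs = torch.randn(32, 3, 224, 224, device="cuda")
    targets = torch.randint(0, 1000, (32,), device="cuda")

    with torch.autograd.profiler.profile(use_cuda=True) as prof:
        optimizer.zero_grad()
        loss = torch.nn.functional.cross_entropy(model(inputs), targets)
        loss.backward()
        optimizer.step()

    # Summarize per-operator cost and dump a Chrome-trace timeline of the step.
    print(prof.key_averages().table(sort_by="cuda_time_total"))
    prof.export_chrome_trace("training_step_timeline.json")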
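The Pseudocode row cites Listing 1, a custom layer used as a building block for a simple but complete network. The sketch below reconstructs that idea, with a custom nn.Module holding learnable nn.Parameter tensors composed alongside a built-in layer; the layer sizes and input shape here are illustrative assumptions rather than the paper's exact listing.

    import torch
    from torch import nn

    class LinearLayer(nn.Module):
        """Custom affine layer: weight and bias registered as learnable Parameters."""
        def __init__(self, in_sz, out_sz):
            super().__init__()
            self.w = nn.Parameter(torch.randn(in_sz, out_sz))
            self.b = nn.Parameter(torch.randn(out_sz))

        def forward(self, activations):
            # Ordinary imperative tensor code defines the forward pass.
            return torch.mm(activations, self.w) + self.b

    class FullBasicModel(nn.Module):
        """Small but complete network composing a built-in and a custom layer."""
        def __init__(self):
            super().__init__()
            self.conv = nn.Conv2d(1, 16, kernel_size=3)   # illustrative sizes
            self.fc = LinearLayer(16 * 26 * 26, 10)

        def forward(self, x):
            t = torch.relu(self.conv(x))
            t = t.flatten(1)                              # flatten for the affine layer
            return torch.softmax(self.fc(t), dim=-1)

    out = FullBasicModel()(torch.randn(4, 1, 28, 28))     # e.g. a batch of 28x28 images
    print(out.shape)                                      # torch.Size([4, 10])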
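The Open Source Code row quotes the paper's summary of PyTorch as immediate execution of dynamic tensor computations with automatic differentiation and GPU acceleration. Below is a minimal sketch of that imperative style; the toy least-squares objective is chosen purely for illustration.

    import torch

    # Device placement is an explicit, ordinary call rather than a graph-level setting.
    device = "cuda" if torch.cuda.is_available() else "cpu"

    x = torch.randn(64, 3, device=device)
    y = torch.randn(64, 1, device=device)
    w = torch.randn(3, 1, device=device, requires_grad=True)

    loss = ((x @ w - y) ** 2).mean()  # evaluated eagerly, no separate graph compilation step
    loss.backward()                   # reverse-mode automatic differentiation
    print(w.grad.norm())              # gradients are stored on the tensor itself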