Deep Neural Networks with Box Convolutions

Authors: Egor Burkov, Victor Lempitsky

NeurIPS 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate the new layer by embedding it into a block that includes the new layer, as well as a residual connection and a 1×1 convolution. We then consider semantic segmentation architectures that have been designed for optimal accuracy-efficiency trade-offs (ENet [20] and ERFNet [21]), and replace analogous blocks based on dilated convolutions inside those architectures with the new block. We show that this replacement both increases the accuracy of the network and decreases the number of operations and the number of learnable parameters.
Researcher Affiliation | Collaboration | Egor Burkov (1, 2) and Victor Lempitsky (1, 2); (1) Samsung AI Center, (2) Skolkovo Institute of Science and Technology (Skoltech), Moscow, Russia
Pseudocode | No | No pseudocode or algorithm blocks are present in the paper.
Open Source Code | Yes | The code of the new layer, as well as the implementation of the BoxENet and BoxERFNet architectures, are available at the project website (https://github.com/shrubb/box-convolutions).
Open Datasets | Yes | Both ENet and ERFNet have been fine-tuned to perform well on the Cityscapes dataset for autonomous driving [7]. ... The ENet architecture has also been tuned for the SUN RGB-D dataset [26], which is a popular benchmark for indoor semantic segmentation...
Dataset Splits | Yes | The dataset (fine version) consists of 2975 training, 500 validation and 1525 test images of urban environments, manually annotated with 19 classes.
Hardware Specification | Yes | We further compare the number of operations, the GPU and CPU inference times on a laptop (an NVIDIA GTX 1050 GPU with cuDNN 7.2, a single core of an Intel i7-7700HQ CPU), and the number of learnable parameters in Table 3.
Software Dependencies | Yes | Our approach was implemented using the Torch7 library [6]. ... (an NVIDIA GTX 1050 GPU with cuDNN 7.2)
Experiment Setup | Yes | We use the same bottleneck narrowing factor 4 ... The dropout rate is set to 0.25 where the feature map resolution is lowest (1/8th of the input), and to 0.15 elsewhere. ... We have used standard learning practices (ADAM optimizer [15], step learning rate policy with the same learning rate as in the original papers [20, 21]) and standard error measures.
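
To make the replacement block described under Research Type more concrete, here is a minimal PyTorch sketch of a residual bottleneck built around a box convolution. The `BoxConv2d` class, its argument order, and its output-channel convention (input planes times boxes per plane) are assumptions based on the linked repository, and the exact BatchNorm/ReLU/dropout ordering is a plausible ENet-style arrangement rather than the paper's verbatim block.

```python
import torch
import torch.nn as nn
from box_convolution import BoxConv2d  # assumed import from the project repository


class BoxBottleneck(nn.Module):
    """Sketch of a residual block combining a 1x1 convolution and a box convolution."""

    def __init__(self, channels, max_input_h, max_input_w,
                 narrowing_factor=4, dropout_rate=0.15):
        super().__init__()
        mid = channels // narrowing_factor          # "bottleneck narrowing factor 4"
        boxes_per_plane = channels // mid           # so that mid * boxes_per_plane == channels
        self.main = nn.Sequential(
            nn.Conv2d(channels, mid, kernel_size=1, bias=False),  # 1x1 channel reduction
            nn.BatchNorm2d(mid),
            nn.ReLU(inplace=True),
            # Assumed signature: (in_planes, boxes_per_plane, max_input_h, max_input_w)
            BoxConv2d(mid, boxes_per_plane, max_input_h, max_input_w),
            nn.BatchNorm2d(channels),
            nn.Dropout2d(dropout_rate),             # 0.25 at 1/8 resolution, 0.15 elsewhere
        )

    def forward(self, x):
        return torch.relu(x + self.main(x))         # residual connection
```

In BoxENet and BoxERFNet, blocks of this kind stand in for the dilated-convolution bottlenecks of the original architectures.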
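As a companion to the Open Source Code row, a minimal usage sketch for the released layer. The package name `box_convolution`, the `BoxConv2d` class, its argument order (input planes, boxes per plane, maximum input height/width), and the output-channel convention are assumptions about the repository's API and may differ from the actual release.

```python
import torch
from box_convolution import BoxConv2d  # assumed package and class names

layer = BoxConv2d(16, 4, 128, 256)     # 16 input planes, 4 learnable boxes per plane,
                                       # box extents bounded by a 128x256 input
x = torch.rand(1, 16, 128, 256)        # dummy feature map
y = layer(x)
print(y.shape)                         # expected (1, 64, 128, 256) if the output has
                                       # in_planes * boxes_per_plane channels (assumption)
```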
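The GPU/CPU inference times mentioned under Hardware Specification could be measured along the following lines. This is a generic PyTorch timing sketch with placeholder input resolution and repetition counts, not the paper's exact benchmarking protocol.

```python
import time
import torch


def time_inference(model, device, h=512, w=1024, warmup=5, runs=20):
    """Average forward-pass time in seconds for one image at h x w (placeholder values)."""
    model = model.to(device).eval()
    x = torch.rand(1, 3, h, w, device=device)
    with torch.no_grad():
        for _ in range(warmup):            # warm-up iterations (allocator, cuDNN autotuning)
            model(x)
        if device == 'cuda':
            torch.cuda.synchronize()       # finish queued GPU work before starting the clock
        start = time.time()
        for _ in range(runs):
            model(x)
        if device == 'cuda':
            torch.cuda.synchronize()       # wait for the last forward pass to complete
    return (time.time() - start) / runs
```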
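Finally, a hedged sketch of the training setup summarized under Experiment Setup: Adam with a step learning-rate policy. The learning rate, step size, decay factor, weight decay, epoch count, and the dummy model and data below are placeholders, since the paper defers to the learning rates of the original ENet/ERFNet papers rather than listing concrete values.

```python
import torch
import torch.nn as nn

model = nn.Conv2d(3, 19, kernel_size=1)                  # placeholder for BoxENet / BoxERFNet
criterion = nn.CrossEntropyLoss(ignore_index=255)        # per-pixel loss; void label ignored
optimizer = torch.optim.Adam(model.parameters(), lr=5e-4, weight_decay=2e-4)      # placeholder values
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=100, gamma=0.5)  # step LR policy

# Placeholder batch standing in for a Cityscapes loader (2975 training images in the real setup).
train_loader = [(torch.rand(2, 3, 64, 64), torch.randint(0, 19, (2, 64, 64)))]

for epoch in range(300):                                  # epoch count is a placeholder
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
    scheduler.step()                                      # decay the learning rate on a fixed schedule
```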