Warped Convolutions: Efficient Invariance to Spatial Transformations

Authors: João F. Henriques, Andrea Vedaldi

Venue: ICML 2017

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Section 6 (Experiments), with subsections Dataset, Implementation, and Baselines and results; Table 1: Results of scale and rotation pose estimation of vehicles in the Google Earth dataset (errors in pixels and degrees, resp.). |
| Researcher Affiliation | Academia | Visual Geometry Group, University of Oxford, United Kingdom. |
| Pseudocode | Yes | Algorithm 1: Warped Convolution (see the illustrative sketch after this table). |
| Open Source Code | No | The paper does not provide a statement about releasing open-source code or a link to a code repository. |
| Open Datasets | Yes | The Google Earth dataset (Heitz & Koller, 2008) contains bounding box annotations... For this task we use the Annotated Facial Landmarks in the Wild (AFLW) dataset (Koestinger et al., 2011). |
| Dataset Splits | Yes | We use the first 10 for training and the rest for validation. ... 20% of the faces were set aside for validation. |
| Hardware Specification | No | The paper mentions 'GPU hardware' but does not specify any particular GPU model, CPU, memory, or other specific hardware components used for running the experiments. |
| Software Dependencies | No | The paper states that networks were 'implemented in MatConvNet (Vedaldi & Lenc, 2015)' but does not provide version numbers for software dependencies or other libraries. |
| Experiment Setup | Yes | The CNN block contains 3 convolutional layers with 3x3 filters, with 50, 20 and 50 output channels respectively. We use dilation factors of 2, 4 and 8 respectively... There is a batch normalization and ReLU layer after each convolution, and a 3x3 max-pooling operator (stride 2)... All networks are trained for 40 epochs using the ADAM solver... The main CNN has 4 convolutional layers, the first two with 5x5 filters, the others being 9x9. The numbers of output channels are 20, 50, 20 and 50, respectively. A 3x3 max-pooling with a stride of 2 is performed after the first layer, and there are ReLU non-linearities between the others. (A hedged architecture sketch follows the table.) |
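The paper's Algorithm 1 (Warped Convolution) is available only as pseudocode, since no code was released. Below is a minimal, hedged Python sketch of the underlying idea as described in the paper's abstract and method: resample the image along the exponential map of a two-parameter transformation group, then apply an ordinary 2D convolution. The choice of rotation/scale as the group, the image-centre pivot, the parameter ranges, and the 64x64 output grid are all illustrative assumptions, not the authors' MatConvNet implementation.

```python
import numpy as np
from scipy.ndimage import map_coordinates
from scipy.signal import convolve2d

def warp_grid_scale_rotation(shape, out_size=64, log_scale_range=(-3.0, 0.0)):
    """Sampling grid for a two-parameter (rotation, log-scale) group,
    exponentiated around the image centre. Pivot point, parameter ranges,
    and grid size are illustrative assumptions."""
    h, w = shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    u1 = np.linspace(0.0, 2.0 * np.pi, out_size, endpoint=False)  # rotation angle
    u2 = np.linspace(*log_scale_range, out_size)                  # log-scale
    theta, log_s = np.meshgrid(u1, u2, indexing="ij")
    r = np.exp(log_s) * min(cy, cx)   # exponential map of the scale parameter
    ys = cy + r * np.sin(theta)
    xs = cx + r * np.cos(theta)
    return ys, xs

def warped_convolution(image, kernel, out_size=64):
    """Sketch of the warped-convolution idea: pre-warp (resample) the image
    with the group's warp grid, then run a standard convolution, so that
    shifts in the warped domain correspond to rotations/scalings of the input."""
    ys, xs = warp_grid_scale_rotation(image.shape, out_size)
    warped = map_coordinates(image, [ys, xs], order=1, mode="nearest")
    return convolve2d(warped, kernel, mode="same")

if __name__ == "__main__":
    img = np.random.rand(128, 128).astype(np.float32)
    kern = np.ones((3, 3), np.float32) / 9.0
    print(warped_convolution(img, kern).shape)  # (64, 64)
```

A standard translation-equivariant convolution applied to the warped image is what makes the construction efficient: equivariance to the chosen group is obtained from a single resampling step plus an off-the-shelf convolution.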
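For the Experiment Setup row, the following is a rough PyTorch re-creation of the quoted three-layer dilated block (the original networks were implemented in MatConvNet). The input channel count, the convolution padding, and the placement of a single pooling layer at the end of the block are assumptions; the 40-epoch ADAM training schedule from the paper is noted only in a comment.

```python
import torch
import torch.nn as nn

class DilatedBlock(nn.Module):
    """Sketch of the quoted block: 3 conv layers with 3x3 filters,
    50/20/50 output channels, dilation factors 2/4/8, each followed by
    batch normalization and ReLU, plus a 3x3 max-pooling with stride 2."""

    def __init__(self, in_channels=3):  # input channels are an assumption
        super().__init__()
        out_channels = [50, 20, 50]     # from the paper
        dilations = [2, 4, 8]           # from the paper
        layers, prev = [], in_channels
        for c, d in zip(out_channels, dilations):
            layers += [
                nn.Conv2d(prev, c, kernel_size=3, dilation=d, padding=d),
                nn.BatchNorm2d(c),
                nn.ReLU(inplace=True),
            ]
            prev = c
        layers.append(nn.MaxPool2d(kernel_size=3, stride=2))  # placement assumed
        self.block = nn.Sequential(*layers)

    def forward(self, x):
        return self.block(x)

model = DilatedBlock()
optimizer = torch.optim.Adam(model.parameters())  # "ADAM solver", 40 epochs per the paper
print(model(torch.randn(1, 3, 64, 64)).shape)     # torch.Size([1, 50, 31, 31])
```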