Understanding image motion with group representations
Authors: Andrew Jaegle, Stephen Phillips, Daphne Ippolito, Kostas Daniilidis
ICLR 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate that a deep neural network trained using this method captures motion in both synthetic 2D sequences and real-world sequences of vehicle motion, without requiring any labels. Networks trained to respect these constraints implicitly identify the image characteristic of motion in different sequence types. In the context of vehicle motion, this method extracts information useful for localization, tracking, and odometry. Our results demonstrate that this representation is useful for learning motion in the general setting where explicit labels are difficult to obtain. |
| Researcher Affiliation | Academia | Andrew Jaegle , Stephen Phillips , Daphne Ippolito, and Kostas Daniilidis University of Pennsylvania Philadelphia, PA 19104 {ajaegle, stephi, daphnei, kostas}@seas.upenn.edu |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any concrete access information (e.g., specific repository link, explicit statement of code release) for the source code. |
| Open Datasets | Yes | We trained a network on a dataset consisting of image sequences created from the MNIST dataset. We then show that our method learns features useful for representing motion on KITTI (Geiger et al. (2012)). |
| Dataset Splits | No | Validation errors are given in Table 1. While a validation set is implied, specific details about the split percentages or sample counts are not provided. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts) used for running its experiments. |
| Software Dependencies | No | All networks were implemented in Torch (Collobert et al. (2011)). No specific version number for Torch or other software dependencies is provided. |
| Experiment Setup | Yes | Networks were trained using Adam (Kingma & Ba (2014)). For MNIST training, we used a fixed decay schedule of 30 epochs with a starting learning rate chosen by random search (1e-2 was a typical value). For MNIST, typical batch sizes were 50-60 sequences, and for KITTI (Geiger et al. (2012)) the batch sizes were typically 25-30 sequences. We used dilated convolutions... We used ReLU nonlinearities and batch normalization... CNN output was passed to an LSTM with 256 hidden units, followed by a linear layer with 256 hidden units. In all experiments, CNN-LSTMs were trained on sequences 3-5 images in length. |
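The setup row mentions dilated convolutions, which enlarge a kernel's receptive field by sampling the input at spaced intervals rather than contiguously. The paper's networks were implemented in Torch and their exact code is not released; the snippet below is only an illustrative NumPy sketch of 1D dilation (the function name and shapes are our own, not the authors'):

```python
import numpy as np

def dilated_conv1d(x, w, dilation=1):
    """Valid-mode 1D cross-correlation with a dilated kernel.

    The kernel w of length k covers an effective input span of
    (k - 1) * dilation + 1 samples, so receptive field grows with
    dilation while the parameter count stays fixed.
    """
    k = len(w)
    span = (k - 1) * dilation + 1
    out_len = len(x) - span + 1
    return np.array([
        sum(x[i + j * dilation] * w[j] for j in range(k))
        for i in range(out_len)
    ])

# With dilation=1 this reduces to an ordinary valid convolution;
# with dilation=2 each tap skips one input sample.
x = np.arange(8, dtype=float)
print(dilated_conv1d(x, np.ones(3), dilation=2))  # [ 6.  9. 12. 15.]
```

In the paper's 2D CNNs the same idea applies per spatial axis; stacking layers with increasing dilation lets the network aggregate motion cues over large image regions cheaply.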