Object Permanence Emerges in a Random Walk along Memory
Authors: Pavel Tokmakov, Allan Jabri, Jie Li, Adrien Gaidon
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The resulting model outperforms existing approaches on several datasets of increasing complexity and realism, despite requiring minimal supervision, and hence being broadly applicable. We demonstrate that object permanence naturally emerges in this process, evaluating our approach on several datasets of increasing complexity and realism (see Figures 5, 6). |
| Researcher Affiliation | Collaboration | Pavel Tokmakov 1 Allan Jabri 2 Jie Li 1 Adrien Gaidon 1 1Toyota Research Institute 2UC Berkeley. |
| Pseudocode | Yes | Algorithm 1 A step of the inference algorithm in Python style. Algorithm 2 Bounding box prediction algorithm in Python style. Algorithm 3 Bounding box refinement function in Python style. |
| Open Source Code | Yes | Source code, models, and data are publicly available at https://tri-ml.github.io/RAM. |
| Open Datasets | Yes | LA-CATER benchmark (Shamsian et al., 2020), a photo-realistic synthetic PD dataset (Tokmakov et al., 2021), and a real-world, multi-object tracking KITTI dataset (Geiger et al., 2012). |
| Dataset Splits | Yes | There are 9300 training, 3327 validation and 1371 test videos respectively. Following (Tokmakov et al., 2021), we use 583 videos for training and 48 for evaluation, and employ the Track AP metric (Russakovsky et al., 2015; Yang et al., 2019; Dave et al., 2020) as a proxy measure of the model's ability to capture object permanence. Following (Tokmakov et al., 2021), we split the 21 labeled videos in half to obtain a validation set and use the Track AP metric for evaluation. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments. |
| Software Dependencies | No | The paper mentions using focal loss but does not provide specific version numbers for any software dependencies or libraries. |
| Experiment Setup | Yes | The temperature parameter τ in Equation 1 is set to 0.1. We use focal loss (Lin et al., 2017) when computing the cross entropy in Equation 3. Radius r in Equation 5 is set to 0.2 H to balance the representational power of the resulting spatio-temporal graph with computational efficiency. Finally, the individual loss weights λRAM, λover in Equation 8 are set to 0.5 and 50 respectively using the validation set of PD. Our model is first trained on PD for 28 epochs with sequences of length 16, exactly following the optimization procedure described in (Tokmakov et al., 2021)... On LA-CATER we are able to use sequences of length 70... We train the model for 8 epochs with a periodic schedule with step 4, where an epoch is defined as 1000 iterations. |
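The reported hyperparameters can be collected into a single configuration for a reproduction attempt. The sketch below is illustrative, not the authors' code: the configuration values are taken directly from the quoted setup, while `focal_loss` follows the standard binary formulation of Lin et al. (2017); the `alpha` and `gamma` defaults are assumptions, as the paper does not state them.

```python
import math

# Hyperparameters quoted in the Experiment Setup cell above.
CONFIG = {
    "temperature_tau": 0.1,   # τ in Equation 1
    "radius_r_frac": 0.2,     # r = 0.2 * H in Equation 5
    "lambda_RAM": 0.5,        # loss weight in Equation 8
    "lambda_over": 50.0,      # loss weight in Equation 8
    "pd_epochs": 28,          # pre-training on PD
    "pd_seq_len": 16,
    "lacater_seq_len": 70,
    "lacater_epochs": 8,      # one epoch = 1000 iterations
    "schedule_step": 4,       # periodic schedule
}

def focal_loss(p, target, alpha=0.25, gamma=2.0):
    """Binary focal loss: -alpha_t * (1 - p_t)**gamma * log(p_t).

    p is the predicted probability of the positive class; target is 0 or 1.
    alpha/gamma defaults are the common choices, assumed here.
    """
    p_t = p if target == 1 else 1.0 - p
    alpha_t = alpha if target == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(max(p_t, 1e-12))
```

The `(1 - p_t)**gamma` factor down-weights well-classified examples, which is why focal loss is a natural drop-in for the cross entropy in Equation 3 when positives are rare.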