Differentiable Grammars for Videos
Authors: AJ Piergiovanni, Anelia Angelova, Michael S. Ryoo
AAAI 2020, pp. 11874-11881
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | It outperforms the state-of-the-art on several challenging datasets and is more accurate for forecasting future activities in videos. |
| Researcher Affiliation | Industry | AJ Piergiovanni, Anelia Angelova, Michael S. Ryoo Robotics at Google {ajpiergi, anelia, mryoo}@google.com |
| Pseudocode | Yes | Algorithm 1 The training of the grammar, with multiple branches |
| Open Source Code | No | We plan to open-source the code. |
| Open Datasets | Yes | MLB-YouTube (Piergiovanni and Ryoo 2018a), Charades (Sigurdsson et al. 2016b), and MultiTHUMOS (Yeung et al. 2015). We also compare on 50 Salads (Stein and McKenna 2013). |
| Dataset Splits | No | The paper uses standard datasets but does not explicitly provide specific train/validation/test dataset splits (percentages or counts) or refer to a standard, predefined splitting methodology for these experiments. |
| Hardware Specification | No | The paper does not provide any specific hardware details such as CPU/GPU models, memory, or specific cloud instances used for running experiments. |
| Software Dependencies | No | We implemented our models in PyTorch. |
| Experiment Setup | Yes | The learning rate was set to 0.1, decayed by a factor of 10 every 50 epochs, and the models were trained for 400 epochs. We pruned the number of branches to 2048 by random selection. The number of grammar parameters varies by dataset, driven by the number of classes: MLB has 8 terminals (for its 8 classes), 5 rules per non-terminal, and 8 non-terminals; Charades has 157 terminals, 10 rules per non-terminal, and 1000 non-terminals. The LSTM has 1000 hidden units for all. (Both the schedule and the pruning step are illustrated in the sketches below.) |
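
To make the reported training recipe concrete, here is a minimal PyTorch sketch of the learning-rate schedule, not the authors' code: only the initial rate of 0.1, the decay by a factor of 10 every 50 epochs, the 400-epoch budget, and the 1000-unit LSTM come from the paper; the optimizer choice (SGD) and the input size are assumptions.

```python
import torch
import torch.nn as nn

# Sketch of the stated schedule. The model stands in for the paper's
# grammar model; only hidden_size=1000 is taken from the setup description.
model = nn.LSTM(input_size=157, hidden_size=1000, batch_first=True)  # input size is an assumption

optimizer = torch.optim.SGD(model.parameters(), lr=0.1)  # optimizer choice is an assumption
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=50, gamma=0.1)

for epoch in range(400):
    # one pass over the training data would go here
    scheduler.step()  # lr: 0.1 -> 0.01 at epoch 50 -> 0.001 at epoch 100 -> ...
```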
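The setup also states that the number of branches was pruned to 2048 by random selection. A hedged sketch of what such a step could look like, assuming branches are held in a Python list (the paper does not specify the data structure):

```python
import torch

def prune_branches(branches, max_branches=2048):
    """Randomly keep at most `max_branches` branches, per the stated setup.

    Storing branches in a list is an assumption made for illustration.
    """
    if len(branches) <= max_branches:
        return branches
    keep = torch.randperm(len(branches))[:max_branches]
    return [branches[i] for i in keep]
```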