SpotEM: Efficient Video Search for Episodic Memory
Authors: Santhosh Kumar Ramakrishnan, Ziad Al-Halah, Kristen Grauman
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments on 200+ hours of video from the Ego4D EM Natural Language Queries benchmark and three different EM models demonstrate the effectiveness of our approach: computing only 10%-25% of the clip features, we preserve 84%-97% of the original EM model's accuracy. |
| Researcher Affiliation | Collaboration | UT Austin, University of Utah, and FAIR (Meta AI). Correspondence to: S. Ramakrishnan <sramakrishnan@utexas.edu>. |
| Pseudocode | No | The paper does not contain explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Project page: https://vision.cs.utexas.edu/projects/spotem |
| Open Datasets | Yes | We evaluate our approach on the large-scale EM NLQ benchmark from Ego4D (Grauman et al., 2022), which is the only public dataset supporting this task to our knowledge. |
| Dataset Splits | Yes | The dataset contains 11.3k/3.9k/4.0k queries annotated over 136/45/46 hours of train/val/test videos. |
| Hardware Specification | No | The paper does not specify the hardware used for experiments (e.g., specific GPU or CPU models). |
| Software Dependencies | No | The paper mentions using PyTorch for implementation but does not specify version numbers for any software dependencies. |
| Experiment Setup | Yes | We provide the hyperparameters for training SpotEM in Table 3. |