Localizing Unseen Activities in Video via Image Query
Authors: Zhu Zhang, Zhou Zhao, Zhijie Lin, Jingkuan Song, Deng Cai
IJCAI 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Furthermore, we construct a new dataset ActivityIBAL by reorganizing the ActivityNet dataset. The extensive experiments show the effectiveness of our method. |
| Researcher Affiliation | Academia | Zhu Zhang1, Zhou Zhao1, Zhijie Lin1, Jingkuan Song2 and Deng Cai3. 1College of Computer Science, Zhejiang University, China; 2University of Electronic Science and Technology of China, China; 3State Key Lab of CAD&CG, Zhejiang University, China |
| Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any specific statement or link regarding the availability of its source code. |
| Open Datasets | Yes | Since none of the existing datasets can be directly used for image-based activity localization, we construct a new dataset ActivityIBAL by reorganizing ActivityNet [Caba Heilbron et al., 2015]. |
| Dataset Splits | Yes | Next, we discard some videos with too short or long action segments, and divide these newly generated videos into three subsets according to their action classes: 160 classes for training, 20 classes for validation and the remaining 20 classes for testing. (A sketch of this class-disjoint split is given below the table.) |
| Hardware Specification | No | The paper mentions 'To avoid exceeding GPU load', implying the use of GPUs, but does not provide any specific details about the models of GPUs or other hardware used for experiments. |
| Software Dependencies | No | The paper mentions using a 'pre-trained Faster R-CNN with VGG-16 backbone networks' and a 'pre-trained 3D ConvNet' but does not specify version numbers for these or any other software components. (An illustrative feature-extraction sketch follows the table.) |
| Experiment Setup | Yes | In the proposed model, we set the projection dimension of various kinds of attention to 256. During the training process, we adopt an Adam optimizer and the initial learning rate is set to 0.0005. (A minimal configuration sketch is shown below the table.) |
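
The Dataset Splits row quotes a class-disjoint partition of 160/20/20 action classes. Below is a minimal sketch of how such a split could be reproduced; the `videos` structure (a list of `(video_id, action_class)` pairs) and the `split_by_class` helper are assumptions for illustration, not code from the paper.

```python
import random

def split_by_class(videos, n_train=160, n_val=20, n_test=20, seed=0):
    """Split (video_id, action_class) pairs into class-disjoint subsets.

    The 160/20/20 class counts follow the ActivityIBAL construction
    described in the paper; everything else here is illustrative.
    """
    classes = sorted({c for _, c in videos})
    assert len(classes) >= n_train + n_val + n_test, "not enough classes"
    rng = random.Random(seed)
    rng.shuffle(classes)
    train_c = set(classes[:n_train])
    val_c = set(classes[n_train:n_train + n_val])
    test_c = set(classes[n_train + n_val:n_train + n_val + n_test])
    train = [v for v in videos if v[1] in train_c]
    val = [v for v in videos if v[1] in val_c]
    test = [v for v in videos if v[1] in test_c]
    return train, val, test
```

Because the split is by class rather than by video, no action class seen at training time appears in validation or testing, which is what makes the localized activities "unseen".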
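The Software Dependencies row notes that region features come from a pre-trained Faster R-CNN with a VGG-16 backbone. torchvision does not ship that exact variant, so the sketch below substitutes the stock ResNet-50-FPN detector purely to illustrate the extraction step; it is not the paper's pipeline.

```python
import torch
from torchvision.models.detection import fasterrcnn_resnet50_fpn

# Stand-in detector: ResNet-50-FPN backbone instead of the paper's VGG-16.
detector = fasterrcnn_resnet50_fpn(weights="DEFAULT").eval()

with torch.no_grad():
    frame = torch.rand(3, 480, 640)    # dummy RGB frame with values in [0, 1]
    detections = detector([frame])[0]  # dict with 'boxes', 'labels', 'scores'

print(detections["boxes"].shape)       # (num_detections, 4) region proposals
```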
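Finally, the Experiment Setup row reports a projection dimension of 256 for the attention modules and Adam with an initial learning rate of 0.0005. A minimal PyTorch sketch of that configuration follows; the `AttentionProjection` module and its 4096-d input are placeholder assumptions, since the paper's architecture is not released.

```python
import torch
import torch.nn as nn

PROJ_DIM = 256  # projection dimension shared by the attention modules (from the paper)

class AttentionProjection(nn.Module):
    """Placeholder projection layer; the 4096-d input is an assumption."""
    def __init__(self, in_dim=4096, proj_dim=PROJ_DIM):
        super().__init__()
        self.proj = nn.Linear(in_dim, proj_dim)

    def forward(self, x):
        return torch.tanh(self.proj(x))

model = AttentionProjection()
optimizer = torch.optim.Adam(model.parameters(), lr=5e-4)  # initial LR reported in the paper
```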