Learning Disentangled Classification and Localization Representations for Temporal Action Localization

Authors: Zixin Zhu, Le Wang, Wei Tang, Ziyi Liu, Nanning Zheng, Gang Hua

AAAI 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate our proposed method on two popular benchmarks for TAL, which outperforms all state-of-the-art methods. ... Experiments. Datasets. The THUMOS14 (Jiang et al. 2014) dataset provides temporal annotations for 20 action categories. ... ActivityNet v1.3 (Heilbron et al. 2015) is currently the largest dataset of action analysis in videos... Ablation Study. In order to explore the effectiveness of our disentanglement network and how disentangled features are better than original features, we conducted in-depth ablation experiments.
Researcher Affiliation | Collaboration | Zixin Zhu¹, Le Wang¹*, Wei Tang², Ziyi Liu³, Nanning Zheng¹, Gang Hua³. ¹Institute of Artificial Intelligence and Robotics, Xi'an Jiaotong University; ²University of Illinois at Chicago; ³Wormpex AI Research
Pseudocode | No | The paper describes its framework and components with text and diagrams but does not include formal pseudocode blocks or algorithms.
Open Source Code | No | The paper makes no explicit statement about releasing source code and provides no link to a code repository for the method.
Open Datasets | Yes | The THUMOS14 (Jiang et al. 2014) dataset provides temporal annotations for 20 action categories. ... ActivityNet v1.3 (Heilbron et al. 2015) is currently the largest dataset of action analysis in videos...
Dataset Splits | Yes | Following the common setting in THUMOS14, we apply 200 videos (including 3,007 action instances) in the validation set for training and conduct evaluation on the 213 annotated videos (including 3,358 action instances) from the test set. ... The training set contains about 10,000 untrimmed videos. Both the validation set and the test set contain about 5,000 untrimmed videos. (A configuration sketch of these splits follows the table.)
Hardware Specification | No | The paper does not specify any particular hardware used for running the experiments, such as CPU or GPU models.
Software Dependencies | No | The paper mentions specific networks/models such as I3D, BSN, and UntrimmedNet but does not provide software dependencies (e.g., programming languages, libraries, frameworks) with version numbers.
Experiment Setup | Yes | The interval between snippets is set to 16 frames. ... The ratio of fusing the RGB and optical flow predictions is 5:6. ... In all experiments, we set λ1 = λ2 = 0.5. (A hedged code sketch of these settings follows the table.)
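
To make the Dataset Splits row easier to scan, here is a small configuration sketch restating the quoted numbers. The dictionary layout and key names are our own illustration, not taken from any released code; the ActivityNet v1.3 counts are approximate, as in the paper.

```python
# Hypothetical restatement of the splits quoted in the Dataset Splits row.
# All key names are illustrative; only the numeric values come from the paper.
DATASET_SPLITS = {
    "THUMOS14": {
        # Common THUMOS14 practice: train on the validation set, test on the test set.
        "train": {"source_split": "validation", "videos": 200, "action_instances": 3007},
        "eval":  {"source_split": "test", "videos": 213, "action_instances": 3358},
    },
    "ActivityNet-v1.3": {
        "train": {"videos": 10_000},       # approximate counts reported in the paper
        "validation": {"videos": 5_000},
        "test": {"videos": 5_000},
    },
}
```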
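
The Experiment Setup row quotes three concrete choices: a snippet stride of 16 frames, a 5:6 late-fusion ratio between RGB and optical-flow predictions, and loss weights λ1 = λ2 = 0.5. Below is a minimal sketch of how such settings are typically applied. Since the code is not released, the function names, the normalization in the fusion step, and the decomposition of the total loss are all assumptions; only the three constants come from the paper.

```python
import numpy as np

SNIPPET_STRIDE = 16             # frames between consecutive snippets (from the paper)
FUSION_RGB, FUSION_FLOW = 5, 6  # RGB : optical-flow fusion ratio of 5:6 (from the paper)
LAMBDA_1 = LAMBDA_2 = 0.5       # loss weights lambda_1 = lambda_2 = 0.5 (from the paper)

def fuse_two_stream(rgb_scores: np.ndarray, flow_scores: np.ndarray) -> np.ndarray:
    """Late-fuse RGB and optical-flow predictions at the reported 5:6 ratio.

    Normalizing by the ratio sum is our assumption; the paper states only the ratio.
    """
    return (FUSION_RGB * rgb_scores + FUSION_FLOW * flow_scores) / (FUSION_RGB + FUSION_FLOW)

def total_loss(main_loss: float, aux_loss_1: float, aux_loss_2: float) -> float:
    # Hypothetical decomposition: the paper gives the weights, but which loss
    # terms they multiply is not specified here, so this weighted sum is
    # illustrative only.
    return main_loss + LAMBDA_1 * aux_loss_1 + LAMBDA_2 * aux_loss_2

# Example: fuse per-class scores from the two streams.
rgb = np.array([0.2, 0.7, 0.1])
flow = np.array([0.3, 0.6, 0.1])
print(fuse_two_stream(rgb, flow))
```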