A Hybrid Attention Mechanism for Weakly-Supervised Temporal Action Localization
Authors: Ashraful Islam, Chengjiang Long, Richard Radke
AAAI 2021, pp. 1637-1645
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our approach on two popular action localization datasets: THUMOS14 (Jiang et al. 2014) and ActivityNet 1.2 (Caba Heilbron et al. 2015). Table 2 summarizes performance comparisons between our proposed HAM-Net and state-of-the-art fully-supervised and weakly-supervised TAL methods on the THUMOS14 dataset. |
| Researcher Affiliation | Collaboration | Ashraful Islam¹, Chengjiang Long², Richard Radke¹; ¹Rensselaer Polytechnic Institute, ²JD Digits AI Lab |
| Pseudocode | No | The paper describes its method using mathematical equations and a diagram (Figure 2), but it does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain any explicit statements about releasing source code or provide links to a code repository. |
| Open Datasets | Yes | We evaluate our approach on two popular action localization datasets: THUMOS14 (Jiang et al. 2014) and ActivityNet 1.2 (Caba Heilbron et al. 2015). |
| Dataset Splits | Yes | THUMOS14 contains 200 validation videos used for training and 213 test videos used for evaluation, with 20 action categories. The ActivityNet 1.2 dataset contains 4,819 videos for training and 2,382 videos for testing, with 200 action classes. During training we randomly sample 500 snippets for THUMOS14 and 80 snippets for ActivityNet, and during evaluation we take all the snippets. (A sampling sketch follows the table.) |
| Hardware Specification | No | The paper mentions using an 'I3D network' for feature extraction but does not provide any specific details about the hardware (e.g., GPU models, CPU types, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions using 'Adam (Kingma and Ba 2015) optimizer' and 'I3D network (Carreira and Zisserman 2017)' but does not specify version numbers for these or any other software components (e.g., Python, PyTorch, TensorFlow, CUDA). |
| Experiment Setup | Yes | We use the Adam (Kingma and Ba 2015) optimizer with learning rate 0.00001, and train for 100 epochs for THUMOS14 and 20 epochs for ActivityNet. For THUMOS14, we set λ0 = λ1 = 0.8, λ2 = λ3 = 0.2, α = β = 0.8, γ = 0.2, and k = 50 for top-k temporal pooling. For ActivityNet, we set α = 0.5, β = 0.1, λ0 = λ1 = λ2 = λ3 = 0.5, and k = 4, and apply additional average pooling to post-process the final CAS. All hyperparameters were determined by grid search. |
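
The snippet-sampling scheme quoted in the Dataset Splits row is straightforward to mirror in code. Below is a minimal sketch, assuming PyTorch and a (T, D) per-video feature tensor; the function name, tensor layout, and feature dimensionality are illustrative assumptions, not taken from the paper.

```python
import torch

def sample_snippets(features: torch.Tensor, num_samples: int, training: bool) -> torch.Tensor:
    """features: (T, D) snippet-level features for one video."""
    if not training:
        return features  # evaluation keeps all snippets, per the quoted setup
    T = features.shape[0]
    if T >= num_samples:
        # sample without replacement, preserving temporal order
        idx, _ = torch.sort(torch.randperm(T)[:num_samples])
    else:
        # short video: sample with replacement up to the target length
        idx, _ = torch.sort(torch.randint(0, T, (num_samples,)))
    return features[idx]

# 500 snippets for THUMOS14, 80 for ActivityNet 1.2 (per the quoted setup);
# 2048-d two-stream I3D features are an assumed dimensionality
train_feats = sample_snippets(torch.randn(1200, 2048), num_samples=500, training=True)
print(train_feats.shape)  # torch.Size([500, 2048])
```

The Experiment Setup row pins down the optimizer, learning rate, epoch counts, loss weights, and the top-k pooling value. The sketch below collects those quoted numbers into a configuration and shows one common reading of top-k temporal pooling (the mean of the k highest per-class snippet scores); the config layout, the `topk_pool` helper, and the stand-in model are assumptions, not the authors' code.

```python
import torch

# Values quoted in the Experiment Setup row; γ is not quoted for
# ActivityNet, so it is left unset here.
CONFIG = {
    "thumos14":       dict(lr=1e-5, epochs=100, k=50, alpha=0.8, beta=0.8, gamma=0.2,
                           lambdas=(0.8, 0.8, 0.2, 0.2)),   # λ0, λ1, λ2, λ3
    "activitynet1.2": dict(lr=1e-5, epochs=20,  k=4,  alpha=0.5, beta=0.1, gamma=None,
                           lambdas=(0.5, 0.5, 0.5, 0.5)),
}

def topk_pool(cas: torch.Tensor, k: int) -> torch.Tensor:
    """Mean of the k highest snippet scores per class.
    cas: (T, C) class activation sequence for one video."""
    scores, _ = torch.topk(cas, k=min(k, cas.shape[0]), dim=0)
    return scores.mean(dim=0)  # (C,) video-level class scores

cfg = CONFIG["thumos14"]
model = torch.nn.Linear(2048, 20)  # stand-in for HAM-Net, which the paper does not release
optimizer = torch.optim.Adam(model.parameters(), lr=cfg["lr"])

cas = torch.randn(500, 20)               # hypothetical (T, C) activations
video_scores = topk_pool(cas, cfg["k"])  # aggregate for the video-level loss
```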
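
Collecting the quoted hyperparameters into an explicit config like this also makes the reproducibility gaps concrete: everything above comes from the paper's text, while the model architecture, batch size, and hardware would still have to be reconstructed from the paper's equations and Figure 2.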