Multimodal Fusion via Teacher-Student Network for Indoor Action Recognition
Authors: Bruce X.B. Yu, Yan Liu, Keith C.C. Chan
AAAI 2021, pp. 3199-3207
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | With extensive experiments on two benchmarking datasets: NTU RGB+D and PKU-MMD, results show that the proposed TSMF consistently performs better than state-of-the-art single modal and multimodal methods. |
| Researcher Affiliation | Academia | Bruce X.B. Yu, Yan Liu,* Keith C.C. Chan Department of Computing, The Hong Kong Polytechnic University {csxbyu, csyliu}@comp.polyu.edu.hk, keithccchan@gmail.com |
| Pseudocode | No | The paper does not contain any blocks explicitly labeled as 'Pseudocode' or 'Algorithm'. |
| Open Source Code | Yes | Code is available: https://github.com/bruceyo/TSMF |
| Open Datasets | Yes | NTU RGB+D: The NTU RGB+D dataset (Shahroudy et al. 2016) was collected with Kinect v2 sensors. PKU-MMD: The PKU-MMD dataset (Liu et al. 2017a) is another popular large dataset collected with Kinect v2. |
| Dataset Splits | Yes | We followed the Cross-Subject (CS) and Cross-View (CV) split settings from (Shahroudy et al. 2016) for evaluating our method. Similar to NTU RGB+D, we adopt the two evaluation protocols (i.e., cross-subject and cross-view) recommended in (Liu et al. 2017a). (A sketch of how these splits are typically derived follows the table.) |
| Hardware Specification | Yes | All experiments are conducted on a workstation with 4 GTX 1080 Ti GPUs. |
| Software Dependencies | No | The paper mentions software components such as 'ResNet', the 'OpenPose' tool, and a stochastic gradient descent optimizer, but does not specify their version numbers. |
| Experiment Setup | Yes | The initial learning rate is set as 0.1, which is decayed by 0.1 at epochs 10 and 50, and training ends at epoch 80. The mini-batch size is set to 64. (These hyperparameters are encoded in the training sketch after the table.) |
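Since the Dataset Splits row only quotes the protocol names, a minimal sketch of how the NTU RGB+D cross-subject/cross-view partition is usually derived may help when re-running the evaluation. The training subject IDs and training camera IDs below follow Shahroudy et al. 2016; `split_of` is a hypothetical helper for illustration, not code from the TSMF repository.

```python
import re

# Training subject IDs for the NTU RGB+D Cross-Subject (CS) protocol and
# training camera IDs for Cross-View (CV), per Shahroudy et al. 2016.
CS_TRAIN_SUBJECTS = {1, 2, 4, 5, 8, 9, 13, 14, 15, 16, 17, 18, 19,
                     25, 27, 28, 31, 34, 35, 38}
CV_TRAIN_CAMERAS = {2, 3}

# NTU RGB+D sample names encode setup (S), camera (C), performer (P),
# replication (R), and action (A), e.g. "S001C002P003R002A013".
NAME_PATTERN = re.compile(r"S(\d{3})C(\d{3})P(\d{3})R(\d{3})A(\d{3})")

def split_of(sample_name: str, protocol: str) -> str:
    """Return 'train' or 'test' for a sample under the 'CS' or 'CV' protocol."""
    m = NAME_PATTERN.search(sample_name)
    if m is None:
        raise ValueError(f"unrecognized sample name: {sample_name}")
    camera, performer = int(m.group(2)), int(m.group(3))
    if protocol == "CS":
        return "train" if performer in CS_TRAIN_SUBJECTS else "test"
    if protocol == "CV":
        return "train" if camera in CV_TRAIN_CAMERAS else "test"
    raise ValueError(f"unknown protocol: {protocol}")

# Example: this clip was recorded by camera 2, so it is a CV training sample.
assert split_of("S001C002P003R002A013", "CV") == "train"
```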
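The Experiment Setup row translates directly into an optimizer configuration. Below is a minimal sketch assuming a PyTorch training loop; the model and data are placeholders, and since the paper's quoted setup does not state momentum or weight-decay values, only the reported hyperparameters (initial LR 0.1, decay by 0.1 at epochs 10 and 50, 80 epochs, batch size 64, SGD) are encoded.

```python
import torch
import torch.nn as nn
from torch.optim import SGD
from torch.optim.lr_scheduler import MultiStepLR
from torch.utils.data import DataLoader, TensorDataset

# Placeholder model and data; the real TSMF network and the NTU RGB+D /
# PKU-MMD loaders would be substituted here.
model = nn.Linear(75, 60)
loader = DataLoader(
    TensorDataset(torch.randn(256, 75), torch.randint(0, 60, (256,))),
    batch_size=64, shuffle=True)                 # mini-batch size 64

criterion = nn.CrossEntropyLoss()
optimizer = SGD(model.parameters(), lr=0.1)      # initial learning rate 0.1
# Multiply the LR by 0.1 at epochs 10 and 50, as reported in the paper.
scheduler = MultiStepLR(optimizer, milestones=[10, 50], gamma=0.1)

for epoch in range(80):                          # training ends at epoch 80
    for x, y in loader:
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()
    scheduler.step()                             # advance the LR schedule
```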