Towards Good Practices for Missing Modality Robust Action Recognition

Authors: Sangmin Woo, Sumin Lee, Yeonju Park, Muhammad Adi Nugroho, Changick Kim

AAAI 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We report our results on four challenging action recognition benchmarks: NTU RGB+D 60 (Shahroudy et al. 2016), NTU RGB+D 120 (Liu et al. 2019), NW-UCLA (Wang et al. 2014), and UWA3D (Rahmani et al. 2016). We set new state-of-the-art results in both complete and missing modality settings.
Researcher Affiliation | Academia | Sangmin Woo, Sumin Lee, Yeonju Park, Muhammad Adi Nugroho, Changick Kim; Korea Advanced Institute of Science and Technology (KAIST); {smwoo95, suminlee94, yeonju29, madin, changick}@kaist.ac.kr
Pseudocode | No | The paper includes mathematical equations for the model (e.g., L = λ_cls·L_cls + λ_rec·L_rec), but it does not contain a clearly labeled 'Pseudocode' or 'Algorithm' block, nor is there structured, code-like formatting for any described procedure. (A sketch of this combined loss appears after the table.)
Open Source Code | No | The paper does not contain an explicit statement or a direct link to a code repository for the methodology described. It references external libraries and previous works, but not its own implementation code.
Open Datasets | Yes | We used NTU RGB+D 60 (NTU60) (Shahroudy et al. 2016), NTU RGB+D 120 (NTU120) (Liu et al. 2019), NW-UCLA (Wang et al. 2014), and UWA3DII (Rahmani et al. 2016) for experiments.
Dataset Splits | Yes | Following the convention (Liu et al. 2019), we evaluate our model using the cross-subject protocol in both NTU60 and NTU120. For NW-UCLA, we followed the cross-view protocol suggested in (Wang et al. 2014), using two views (V1, V2) for training and the remaining view (V3) for testing. For UWA3DII, we used the top and right views for training and the front and left views for testing. (A split sketch appears after the table.)
Hardware Specification | No | The paper describes the experimental setup and training parameters, but it does not specify any particular hardware components such as GPU models, CPU types, or memory specifications used for running the experiments.
Software Dependencies | No | The paper mentions using the 'AdamW optimizer' and 'cross-entropy loss with label smoothing', and references 'PyTorch Image Models', but does not provide specific version numbers for these or other software dependencies.
Experiment Setup | Yes | Settings. We used the AdamW optimizer (Loshchilov and Hutter 2017) with an initial learning rate of 10^-4 and weight decay of 10^-4 for a batch size of 32. The learning rate is decayed by a factor of 10 every 30 epochs. We used cross-entropy loss with label smoothing of factor 0.1 (Szegedy et al. 2016). (A configuration sketch follows the table.)
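
The combined objective quoted in the Pseudocode row, L = λ_cls·L_cls + λ_rec·L_rec, weighs a classification term against a reconstruction term. Below is a minimal PyTorch sketch of such a weighted two-term loss; the weight defaults, the mean-squared-error choice for the reconstruction term, and all argument names are illustrative assumptions, not the authors' implementation.

```python
import torch.nn.functional as F

def total_loss(logits, labels, recon, target, lambda_cls=1.0, lambda_rec=1.0):
    """Weighted sum L = lambda_cls * L_cls + lambda_rec * L_rec.

    lambda_cls and lambda_rec are hypothetical defaults; the page
    does not quote the paper's weighting values.
    """
    # Classification term: cross-entropy with label smoothing 0.1,
    # matching the quoted experiment settings.
    l_cls = F.cross_entropy(logits, labels, label_smoothing=0.1)
    # Reconstruction term: an L2 penalty between recovered features
    # and their targets. The exact form is not quoted on this page,
    # so MSE is used here as an assumption.
    l_rec = F.mse_loss(recon, target)
    return lambda_cls * l_cls + lambda_rec * l_rec
```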
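
The cross-view protocols quoted in the Dataset Splits row reduce to filtering samples by their view identifier (NW-UCLA: V1 and V2 for training, V3 for testing). Here is a sketch under the assumption that each sample record carries a view field; the record layout is hypothetical.

```python
# Hypothetical records: (clip_id, view_id, action_label).
samples = [
    ("clip_0001", "V1", 3),
    ("clip_0002", "V2", 7),
    ("clip_0003", "V3", 3),
]

TRAIN_VIEWS = {"V1", "V2"}  # NW-UCLA cross-view: two views for training
TEST_VIEWS = {"V3"}         # remaining view for testing

train_set = [s for s in samples if s[1] in TRAIN_VIEWS]
test_set = [s for s in samples if s[1] in TEST_VIEWS]
```

The same pattern covers UWA3DII (top and right views for training, front and left for testing) and, filtering on subject IDs instead of views, the cross-subject protocol used for NTU60 and NTU120.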
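
The quoted experiment settings map directly onto standard PyTorch components: AdamW with learning rate 10^-4 and weight decay 10^-4, a decay of the learning rate by a factor of 10 every 30 epochs, and cross-entropy with label smoothing 0.1. The sketch below assumes a placeholder network; only the quoted hyperparameters come from the paper.

```python
import torch
import torch.nn as nn

# Placeholder standing in for the paper's network
# (e.g., 60 output classes for NTU60).
model = nn.Linear(256, 60)

# AdamW with the quoted initial learning rate and weight decay.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4, weight_decay=1e-4)

# "Decayed by a factor of 10 every 30 epochs" read as a step
# schedule: lr -> lr * 0.1 at epochs 30, 60, 90, ...
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.1)

# Cross-entropy with label smoothing of factor 0.1 (Szegedy et al. 2016).
criterion = nn.CrossEntropyLoss(label_smoothing=0.1)

# Batch size 32 as quoted; dataset and loader construction are omitted here.
```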