Riemannian Local Mechanism for SPD Neural Networks

Authors: Ziheng Chen, Tianyang Xu, Xiao-Jun Wu, Rui Wang, Zhiwu Huang, Josef Kittler

AAAI 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experiments involving multiple visual tasks validate the effectiveness of our approach. We evaluate the proposed MSNet in three challenging visual classification tasks."
Researcher Affiliation | Academia | School of Artificial Intelligence and Computer Science, Jiangnan University; School of Computing and Information Systems, Singapore Management University; Centre for Vision, Speech and Signal Processing (CVSSP), University of Surrey
Pseudocode | No | Figure 1 is an "Illustration of the proposed Multi-scale Submanifold Network (MSNet)", but it is a conceptual diagram, not pseudocode; no explicit "Algorithm" or "Pseudocode" block appears in the paper.
Open Source Code | Yes | "The supplement and source code can be found in https://github.com/GitZH-Chen/MSNet.git."
Open Datasets | Yes | Cambridge-Gesture (CG) (Kim and Cipolla 2008) and UCF-101 (Soomro, Zamir, and Shah 2012) for video classification, and First-Person Hand Action (FPHA) (Garcia-Hernando et al. 2018) for skeleton-based action recognition.
Dataset Splits | Yes | "For this dataset, following the criteria in (Chen et al. 2020), we randomly select 20 and 80 clips for training and testing per class, respectively. For a fair comparison, we follow the protocols in (Wang, Wu, and Kittler 2021). In detail, we use 600 action clips for training and 575 for testing. The seventy-thirty-ratio (STR) protocol is exploited to build the gallery and probes."
Hardware Specification | Yes | "For training our MSNet, we use an i5-9400 (2.90GHz) CPU with 8GB RAM."
Software Dependencies | No | No specific software versions (e.g., Python 3.x, PyTorch 1.x) are given; the paper mentions source code but not the environment or dependency versions.
Experiment Setup | Yes | The initial learning rate is λ = 1e-2, reduced by a factor of 0.8 every 50 epochs down to a minimum of 1e-3. The batch size is set to 30, and the weights in the BiMap layers are initialized as random semi-orthogonal matrices.
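The per-class random split quoted under Dataset Splits (20 training and 80 testing clips per class for Cambridge-Gesture) can be sketched as follows; the function name, seed handling, and dictionary data layout are assumptions for illustration, not details from the paper.

```python
import random

def split_per_class(clips_by_class, n_train=20, n_test=80, seed=0):
    """Randomly split clips per class: n_train for training, n_test for testing.

    clips_by_class: dict mapping class label -> list of clip identifiers.
    Returns two lists of (class, clip) pairs. Hedged sketch of the
    per-class protocol described for the CG dataset.
    """
    rng = random.Random(seed)  # fixed seed for a reproducible split
    train, test = [], []
    for cls, clips in clips_by_class.items():
        shuffled = clips[:]          # copy so the input is not mutated
        rng.shuffle(shuffled)
        train += [(cls, c) for c in shuffled[:n_train]]
        test += [(cls, c) for c in shuffled[n_train:n_train + n_test]]
    return train, test
```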
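The learning-rate schedule in the Experiment Setup row (start at 1e-2, reduce by 0.8 every 50 epochs, floor at 1e-3) can be sketched as a step-decay function; the function name and the exact multiplicative-step semantics are assumptions, since the paper only states the rate, factor, interval, and minimum.

```python
def step_decay_lr(epoch, base_lr=1e-2, decay=0.8, step=50, min_lr=1e-3):
    """Step-decay schedule: multiply base_lr by decay every `step` epochs,
    never dropping below min_lr. Hedged sketch of the schedule described
    for training MSNet."""
    return max(base_lr * decay ** (epoch // step), min_lr)
```

For example, the rate stays at 1e-2 for epochs 0-49, drops to 8e-3 at epoch 50, and eventually saturates at the 1e-3 floor.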