Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
DepMGNN: Matrixial Graph Neural Network for Video-based Automatic Depression Assessment
Authors: Zijian Wu, Leijing Zhou, Shuanglin Li, Changzeng Fu, Jun Lu, Jing Han, Yi Zhang, Zhuang Zhao, Siyang Song
AAAI 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments show that the explicit spatio-temporal modeling on 2D facial feature maps, facilitated by our matrixial graph/MGNN, provided significant benefits, leading our approach to achieve new state-of-the-art performances on AVEC2013 and AVEC2014 datasets with large advantages. We evaluate our approach on two visual depression assessment benchmark datasets: AVEC 2013 (Valstar et al. 2013) and AVEC 2014 (Valstar et al. 2014). To deeply investigate the effectiveness of our approach, we conduct a series of ablation studies on AVEC 2014 dataset. |
| Researcher Affiliation | Collaboration | 1. AffectAI, Anhui, China; 2. University of Exeter, UK; 3. Nanjing University of Science and Technology, China; 4. Zhejiang University, China; 5. University of Newcastle-upon-Tyne, UK; 6. Osaka University, Japan |
| Pseudocode | No | The paper describes the methodology using textual explanations and mathematical equations, but it does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code: https://github.com/AffectAI/MGNN |
| Open Datasets | Yes | Datasets. We evaluate our approach on two visual depression assessment benchmark datasets: AVEC 2013 (Valstar et al. 2013) and AVEC 2014 (Valstar et al. 2014). |
| Dataset Splits | Yes | Both datasets were evenly divided into three subsets: training, development and testing. |
| Hardware Specification | No | More details of the employed datasets, implementation (hyper-parameters, libraries and hardware) and metrics are provided in Supplementary Material. |
| Software Dependencies | No | More details of the employed datasets, implementation (hyper-parameters, libraries and hardware) and metrics are provided in Supplementary Material. |
| Experiment Setup | Yes | Training: Our entire framework is trained in a straightforward end-to-end manner, where the BMC loss (Ren et al. 2022), which addresses issues caused by unbalanced training data (i.e., the depression label distribution), and the Mean Square Error (MSE) loss are jointly employed with equal weight to compare the prediction $y_{pred}$ and the ground truth $y$: $\mathcal{L} = \underbrace{\|y - y_{pred}\|_2^2}_{\text{MSE Loss}} \underbrace{-\log \frac{\exp(-\|y - y_{pred}\|_2^2/\tau)}{\sum_{i=1}^{B} \exp(-\|y_i - y_{pred}\|_2^2/\tau)}}_{\text{BMC Loss}}$ (4), where $\|\cdot\|_2$ denotes the L2 norm, $B$ is the training batch size, and $\tau$ is a temperature coefficient empirically set to 2 in this paper. Implementation details: The face region of each video frame is first cropped and resized to 224 × 224. Then, we choose ResNet50 (He et al. 2016) and SENet (Hu, Shen, and Sun 2018) pre-trained on VGGFace2 (Cao et al. 2018) as our backbones. |
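The joint loss in Eq. (4) can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: it assumes scalar depression scores, and the function name `joint_loss` and its arguments are hypothetical. The BMC term contrasts the squared error of the true label against the squared errors of every label in the training batch, softmax-style (Ren et al. 2022).

```python
import numpy as np

def joint_loss(y_pred, y_true, batch_targets, tau=2.0):
    """Hypothetical sketch of Eq. (4): MSE loss + BMC loss, equally weighted.

    y_pred, y_true:  scalar prediction and ground-truth score for one sample.
    batch_targets:   array of all B ground-truth labels in the training batch.
    tau:             temperature coefficient, set to 2 in the paper.
    """
    # MSE term: ||y - y_pred||_2^2 (squared error for a scalar score)
    sq_err = (y_true - y_pred) ** 2
    # BMC term: -log( exp(-sq_err/tau) / sum_i exp(-||y_i - y_pred||^2/tau) ),
    # expanded here as sq_err/tau + log-sum-exp over the batch labels.
    logits = -((batch_targets - y_pred) ** 2) / tau
    bmc = sq_err / tau + np.log(np.sum(np.exp(logits)))
    return sq_err + bmc
```

Note that when the batch contains only the sample's own label, the BMC term cancels to zero and the loss reduces to the plain squared error; with a label-imbalanced batch, the log-sum-exp denominator reweights the gradient toward rare labels.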