CoFInAl: Enhancing Action Quality Assessment with Coarse-to-Fine Instruction Alignment
Authors: Kanglei Zhou, Junlin Li, Ruizhi Cai, Liyuan Wang, Xingxing Zhang, Xiaohui Liang
IJCAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on two long-term AQA datasets demonstrate CoFInAl achieves state-of-the-art performance with significant correlation gains of 5.49% and 3.55% on Rhythmic Gymnastics and Fis-V, respectively. Our code is available at https://github.com/ZhouKanglei/CoFInAl_AQA. |
| Researcher Affiliation | Academia | Kanglei Zhou (1), Junlin Li (2), Ruizhi Cai (1), Liyuan Wang (3), Xingxing Zhang (3) and Xiaohui Liang (1,4). (1) State Key Laboratory of Virtual Reality Technology and Systems, Beihang University; (2) China Three Gorges University; (3) Department of Computer Science and Technology, Institute for AI, BNRist Center, Tsinghua-Bosch Joint ML Center, THBI Lab, Tsinghua University; (4) Zhongguancun Laboratory. liang_xiaohui@buaa.edu.cn |
| Pseudocode | No | The paper describes its modules and processes textually and through mathematical formulations, but it does not include any clearly labeled 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | Yes | Our code is available at https://github.com/ZhouKanglei/CoFInAl_AQA. |
| Open Datasets | Yes | The Rhythmic Gymnastics (RG) dataset [Zeng et al., 2020] comprises a total of 1000 videos... The Figure Skating Video (Fis-V) dataset [Pirsiavash et al., 2014; Parmar and Tran Morris, 2017] consists of 500 videos... |
| Dataset Splits | Yes | The dataset is divided into training and evaluation sets, with 200 videos allocated for training and 50 for evaluation in each action category. ... Adhering to the official split, the dataset is divided into 400 training videos and 100 testing videos. |
| Hardware Specification | No | The paper states 'CoFInAl is implemented using PyTorch on a GPU for efficient parallel processing,' but it does not specify any particular GPU model, CPU model, or other hardware details. |
| Software Dependencies | No | The paper mentions 'PyTorch' as the implementation framework but does not provide specific version numbers for PyTorch or any other software dependencies. |
| Experiment Setup | Yes | The feature dimensions D_C, D_P, D_S are set to 1024, 512, and 256, respectively. ... We optimize all models using SGD with a momentum of 0.9. The batch size is 32, and the learning rate starts at 0.01, gradually decreasing to 0.0001 through a cosine annealing strategy. The number of epochs is set to 200. The loss weights λ_C, λ_F, λ_R are set to 1. To further regularize the models, we apply a dropout of 0.3/0.7 for RG/Fis-V, and the weight decay is set to 0.01. |
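
The training hyperparameters quoted in the Experiment Setup row can be collected into a small configuration, and the learning-rate decay can be sketched in closed form. This is a minimal illustration, not the authors' code: the dictionary keys and function names are hypothetical, the schedule is assumed to use the common closed-form cosine annealing formula, and it is assumed to step once per epoch (the quoted text does not say whether decay is per epoch or per iteration).

```python
import math

# Values quoted in the Experiment Setup row; the key names are illustrative only.
CONFIG = {
    "optimizer": "SGD",
    "momentum": 0.9,
    "batch_size": 32,
    "lr_start": 0.01,
    "lr_end": 0.0001,
    "epochs": 200,
    "weight_decay": 0.01,
    "dropout": {"RG": 0.3, "Fis-V": 0.7},
    "loss_weights": {"lambda_C": 1.0, "lambda_F": 1.0, "lambda_R": 1.0},
}


def cosine_annealed_lr(epoch, total_epochs, lr_start, lr_end):
    """Closed-form cosine annealing from lr_start down to lr_end.

    Assumption: the paper only says "cosine annealing"; this is the
    standard formula, not a confirmed detail of the implementation.
    """
    progress = epoch / total_epochs
    return lr_end + 0.5 * (lr_start - lr_end) * (1.0 + math.cos(math.pi * progress))


def lr_schedule(config=CONFIG):
    """Learning rate at every epoch index from 0 to config['epochs'] inclusive."""
    return [
        cosine_annealed_lr(e, config["epochs"], config["lr_start"], config["lr_end"])
        for e in range(config["epochs"] + 1)
    ]


if __name__ == "__main__":
    schedule = lr_schedule()
    # Print the learning rate at the start, midpoint, and end of training.
    for epoch in (0, 100, 200):
        print(epoch, schedule[epoch])
```

Under these assumptions the rate falls smoothly from 0.01 to 0.0001 over the 200 epochs, passing through the midpoint of the two values at epoch 100. In a PyTorch training loop the same behavior would typically come from a built-in cosine scheduler rather than a hand-written function.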