Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Disentangled Concepts Speak Louder Than Words: Explainable Video Action Recognition
Authors: Jongseo Lee, Wooil Lee, Gyeong-Moon Park, Seong Tae Kim, Jinwoo Choi
NeurIPS 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on four datasets KTH, Penn Action, HAA500, and UCF101 demonstrate that DANCE significantly improves explanation clarity with competitive performance. We validate the superior interpretability of DANCE through a user study. Experimental results also show that DANCE is beneficial for model debugging, editing, and failure analysis. 4 Experimental Results In this section, we carefully design and conduct rigorous experiments to answer the following research questions: (1) Does DANCE generate explanations that are easy for humans to interpret in the context of action prediction? (Section 4.1) (2) Can DANCE detect changes in the temporal domain, such as reversed input sequences? (Section 4.1) (3) What is the performance trade-off, if any, when interpretability is introduced into a previously non-interpretable model? (Section 4.2) (4) Can DANCE be effectively used for model debugging and editing? (Section 4.3) |
| Researcher Affiliation | Academia | Jongseo Lee¹, Wooil Lee¹, Gyeong-Moon Park², Seong Tae Kim¹, Jinwoo Choi¹; ¹Kyung Hee University, Republic of Korea; ²Korea University, Republic of Korea. EMAIL, EMAIL |
| Pseudocode | No | The paper describes methods in text and figures (e.g., Figure 2, Figure 3) but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | Yes | Our project page is available at https://jong980812.github.io/DANCE/. Justification: We provide code in the supplementary material. |
| Open Datasets | Yes | Experiments on four datasets KTH [45], Penn Action [61], HAA500 [9], and UCF101 [49] demonstrate that DANCE significantly improves explanation clarity with competitive performance. We conduct experiments on four video action recognition datasets: KTH [45], Penn Action [61], HAA500 [9], and UCF-101 [49]. |
| Dataset Splits | Yes | For more details on the dataset and implementation, please refer to the supplementary materials. Justification: The paper specifies all the necessary training and test details, including data splits, hyperparameters, and the type of optimizer used. These details are presented in the supplementary materials. |
| Hardware Specification | Yes | Justification: We report compute resource details including GPU type, memory, and per-experiment training time in the supplementary materials. This includes estimates of training time for each dataset, hardware specifications, ensuring reproducibility. |
| Software Dependencies | No | For each key clip $V^s_i$, we apply a 2D pose estimation model [59] to every frame to obtain a pose sequence $P^s_i \in \mathbb{R}^{L \times J \times 2}$, where $J$ is the number of joints. For each action class, we query GPT-4o [19] with two prompts: To avoid manual concept annotation, we employ a vision-language dual encoder [57] to generate concept pseudo labels for each training video $V_i$. Specific version numbers for these software components or other libraries (e.g., Python, PyTorch) are not provided in the main text. |
| Experiment Setup | No | For more details on the dataset and implementation, please refer to the supplementary materials. Justification: The paper specifies all the necessary training and test details, including data splits, hyperparameters, and the type of optimizer used. These details are presented in the supplementary materials. The main paper mentions 'lambda and alpha are balancing hyperparameters' but does not provide their specific values in the main text. |