Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Prompt-guided Disentangled Representation for Action Recognition
Authors: Tianci Wu, Guangming Zhu, Lu Jiang, Siyuan Wang, Ning Wang, Nuoye Xiong, Liang Zhang
NeurIPS 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on two complex video action datasets, Charades and Sports HHI, demonstrate the effectiveness of our approach against state-of-the-art methods. |
| Researcher Affiliation | Academia | Tianci Wu, Guangming Zhu, Jiang Lu, Siyuan Wang, Ning Wang, Nuoye Xiong, Zhang Liang; School of Computer Science and Technology, Xidian University; EMAIL EMAIL |
| Pseudocode | No | The paper describes methods in paragraph text and equations, but no structured pseudocode or algorithm blocks are explicitly presented. |
| Open Source Code | Yes | Our code can be found in https://github.com/iamsnaping/ProDA.git. |
| Open Datasets | Yes | Dataset. (1) The Charades dataset [40] consists of 9,848 videos... (2) The Sports HHI dataset [50] is a specially designed dataset focusing on human-human interaction in sports. |
| Dataset Splits | Yes | On Charades, all experiments are conducted under the same design. For each video, the network takes N frames as input. During training, frames are randomly sampled from the video, while during validation, the frames are extracted evenly over the whole video. [...] We conduct domain shift experiments on the Charades dataset to evaluate the generalization ability of our method. Specifically, we follow the experimental setup of [47], splitting Charades into five disjoint subsets with different action groups. The details are shown in Table 10. |
| Hardware Specification | No | Justification: Since some of the compared methods do not provide complete computational resource information (and some do not release their code), it is challenging for us to perform a fair comparison regarding computational cost. |
| Software Dependencies | No | The paper mentions various models and network types (e.g., CNN, ViT, GNN, Dual-AI, CLIP-B/16) and their underlying principles (e.g., self-attention) but does not provide specific software dependencies like programming language versions or library versions. |
| Experiment Setup | Yes | For each video, we uniformly sample 16 frames as input. We adopt a two-stage training strategy for all experiments. The first stage (pre-train) runs for 5 epochs with a learning rate of 2e-4, aiming to learn the representations for $a_u$ and $a_s$. In the second stage (post-train), we load the checkpoint from the epoch with the best performance on $a_u$ and $a_s$ in the pre-training stage, and only train the classifier for $a_t$. This stage is also trained for 5 epochs with a learning rate of 1e-4. |
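
The frame sampling and two-stage schedule quoted in the Dataset Splits and Experiment Setup rows can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function name `sample_frame_indices`, the segment-centre rule used for even sampling, the fallback for videos shorter than the sample count, and the `TRAINING_STAGES` layout are all assumptions made here. Only the frame count (16), the epoch counts, and the learning rates come from the quoted text.

```python
import random

def sample_frame_indices(num_video_frames, num_samples=16, train=False, rng=None):
    """Return sorted frame indices for one video (hypothetical helper).

    train=True  -> indices drawn at random, as quoted for training.
    train=False -> indices spread evenly over the whole video, as quoted
                   for validation (segment centres are an assumption).
    """
    if num_video_frames <= 0 or num_samples <= 0:
        raise ValueError("frame counts must be positive")
    if train:
        if rng is None:
            rng = random.Random()
        frames = range(num_video_frames)
        if num_video_frames >= num_samples:
            # Enough frames: draw distinct indices.
            return sorted(rng.sample(frames, num_samples))
        # Assumed fallback for short videos: draw with repetition.
        return sorted(rng.choices(frames, k=num_samples))
    # Even sampling: take the centre of each of num_samples equal segments.
    return [
        min(num_video_frames - 1, int((i + 0.5) * num_video_frames / num_samples))
        for i in range(num_samples)
    ]

# Two-stage schedule with the values reported in the Experiment Setup row.
# The dictionary layout is illustrative only.
TRAINING_STAGES = [
    {"name": "pre-train", "epochs": 5, "lr": 2e-4},
    {"name": "post-train", "epochs": 5, "lr": 1e-4},
]
```

For a 160-frame video, `sample_frame_indices(160)` picks one frame from the middle of each 10-frame segment, while `sample_frame_indices(160, train=True)` returns 16 distinct random indices in ascending order.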