Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
AsymRnR: Video Diffusion Transformers Acceleration with Asymmetric Reduction and Restoration
Authors: Wenhao Sun, Rong-Cheng Tu, Jingyi Liao, Zhao Jin, Dacheng Tao
ICML 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conducted extensive experiments to evaluate its effectiveness and design choices using state-of-the-art video DiTs, including CogVideoX (Yang et al., 2024), Mochi-1 (Team, 2024b), HunyuanVideo (Team, 2024c), and FastVideo (Team, 2024a). With AsymRnR, these models demonstrate significant acceleration with negligible degradation in video quality and, in some cases, even improve performance as evaluated on VBench (Huang et al., 2024). Quantitative Comparison. Table 1 provides quantitative comparisons between two configurations: a base version with perceptually near-lossless quality and a fast version that achieves higher speed at the cost of slight quality degradation. We set the matching cache step to s = 5 and the partition stride to 2×2×2 for both ToMe and AsymRnR. Our higher VBench scores and lower LPIPS, achieved at comparable FLOPs and latency, demonstrate superior video quality and semantic preservation. |
| Researcher Affiliation | Academia | 1College of Computing and Data Science, Nanyang Technological University, Singapore, Singapore 2Institute for Infocomm Research (I2R), A*STAR, Singapore, Singapore. Correspondence to: Rong-Cheng Tu <EMAIL>, Dacheng Tao <EMAIL>. |
| Pseudocode | No | The paper describes methods using equations and prose, but does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code is available at https://github.com/wenhao728/AsymRnR. |
| Open Datasets | Yes | We follow previous work and perform sampling on over 900 text prompts from the standard VBench suite (Huang et al., 2024). |
| Dataset Splits | No | The paper mentions sampling on the VBench suite but does not specify any training/test/validation splits for the experiments conducted in the paper. |
| Hardware Specification | Yes | Latency is measured using an NVIDIA A100 for CogVideoX variants and an NVIDIA H100 for the rest of the models due to the availability of hardware at the time. |
| Software Dependencies | No | The paper does not provide specific version numbers for software dependencies such as Python, PyTorch, or CUDA. |
| Experiment Setup | Yes | The reduction schedule is defined by two hyperparameters: the similarity threshold for reduction and the reduction rates. The similarity threshold is tuned individually for each Di T model to maintain the quality. ... The reduction rates are adjusted to achieve the desired acceleration (e.g., a 1.30 speedup). All schedule specifications are summarized in Table 8. |
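To make the reduction-and-restoration idea concrete, here is a minimal sketch of similarity-based token reduction with a mapping that allows restoration afterward. This is an illustrative simplification, not the paper's actual AsymRnR algorithm: the greedy matching, the `threshold` parameter, and the function names are assumptions for demonstration, and the real method operates inside video DiT attention blocks with per-model tuned schedules (Table 8 in the paper).

```python
import numpy as np

def reduce_tokens(x, threshold=0.9):
    """Greedy similarity-based token reduction (illustrative only).

    x: (N, D) array of token features. A token whose cosine similarity
    to an already-kept token exceeds `threshold` is dropped and mapped
    to that kept token, so the sequence can be restored later.
    Returns (kept_tokens, mapping), where mapping[i] gives the index
    into kept_tokens assigned to original token i.
    """
    norm = x / np.linalg.norm(x, axis=1, keepdims=True)
    kept, mapping = [], np.empty(len(x), dtype=int)
    for i, t in enumerate(norm):
        if kept:
            sims = norm[kept] @ t  # cosine similarity to each kept token
            j = int(np.argmax(sims))
            if sims[j] > threshold:
                mapping[i] = j  # redundant token: reuse the kept one
                continue
        mapping[i] = len(kept)
        kept.append(i)
    return x[kept], mapping

def restore_tokens(y, mapping):
    """Broadcast the processed kept tokens back to the full sequence."""
    return y[mapping]

# Example: three tokens, the first two nearly identical.
tokens = np.array([[1.0, 0.0], [0.99, 0.01], [0.0, 1.0]])
reduced, mapping = reduce_tokens(tokens, threshold=0.95)
# The (cheaper) compute runs on `reduced`; restoration recovers the shape.
restored = restore_tokens(reduced, mapping)
```

In this toy run the second token is merged into the first, so the expensive stage sees 2 tokens instead of 3; tightening `threshold` trades more speedup for more degradation, mirroring the base-vs-fast configurations described above.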