Towards Smooth Video Composition

Authors: Qihang Zhang, Ceyuan Yang, Yujun Shen, Yinghao Xu, Bolei Zhou

ICLR 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate our approach on a range of datasets and show substantial improvements over baselines on video generation.
Researcher Affiliation | Collaboration | Qihang Zhang¹, Ceyuan Yang², Yujun Shen³, Yinghao Xu¹, Bolei Zhou⁴; ¹The Chinese University of Hong Kong, ²Shanghai AI Laboratory, ³Ant Group, ⁴University of California, Los Angeles
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | Code and models are publicly available at https://genforce.github.io/StyleSV.
Open Datasets | Yes | We evaluate our approach on a range of datasets and show substantial improvements over baselines on video generation. Code and models are publicly available at https://genforce.github.io/StyleSV.
Dataset Splits | No | The paper mentions reporting results at the best FVD16 score after training, which implies a validation step, but it does not explicitly provide dataset split information (percentages, counts, or predefined splits) for training, validation, and testing, as would be needed to reproduce the data partitioning.
Hardware Specification | Yes | We follow the training recipe of StyleGAN-V and train models on a server with 8 A100 GPUs.
Software Dependencies | No | The paper states, 'Our method is developed based on the official implementation of StyleGAN-V (Skorokhodov et al., 2022),' but it does not provide specific version numbers for software components like Python, PyTorch, or CUDA.
Experiment Setup | Yes | In terms of various methods and datasets, we grid search the R1 regularization weight, whose details are available in the Appendix. Empirically, we find that a smaller R1 value (e.g., 0.25) works well for the pretraining stage (Config-C), while a larger R1 value (e.g., 4) is better suited to video generation learning. (See the R1-penalty sketch below.)
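
The R1 weights discussed in the Experiment Setup row are coefficients on the R1 gradient penalty (Mescheder et al., 2018) applied to the discriminator, as is standard in StyleGAN-family training. Below is a minimal PyTorch sketch of such a configurable penalty; the function name r1_penalty, the discriminator call signature, and the (N, C, T, H, W) video tensor layout are illustrative assumptions, not the authors' StyleSV implementation.

    import torch

    def r1_penalty(discriminator, real_videos, r1_weight):
        # Minimal sketch of the R1 gradient penalty (Mescheder et al., 2018).
        # Illustrative only: the discriminator interface and the
        # (N, C, T, H, W) video layout are assumptions, not StyleSV code.
        real_videos = real_videos.detach().requires_grad_(True)
        logits = discriminator(real_videos)
        # Gradient of the summed real logits w.r.t. the real inputs.
        (grads,) = torch.autograd.grad(
            outputs=logits.sum(), inputs=real_videos, create_graph=True
        )
        # Per-sample squared gradient norm, averaged over the batch.
        grad_norm2 = grads.pow(2).flatten(start_dim=1).sum(dim=1).mean()
        # r1_weight is the grid-searched coefficient reported in the paper
        # (e.g., 0.25 for the pretraining stage, 4 for video training).
        return 0.5 * r1_weight * grad_norm2

In a typical GAN training loop, this term is added to the discriminator loss on real batches (often lazily, every few steps); the paper's grid search corresponds to varying r1_weight per method and dataset.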