Slicedit: Zero-Shot Video Editing With Text-to-Image Diffusion Models Using Spatio-Temporal Slices

Authors: Nathaniel Cohen, Vladimir Kulikov, Matan Kleiner, Inbar Huberman-Spiegelglas, Tomer Michaeli

ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
--- | --- | ---
Research Type | Experimental | "Through extensive experiments, we demonstrate Slicedit's ability to edit a wide range of real-world videos, confirming its clear advantages compared to existing competing methods."
Researcher Affiliation | Academia | Mines Paris - PSL Research University, Paris, France; Technion - Israel Institute of Technology, Haifa, Israel.
Pseudocode | Yes | "H.2. Slicedit: The resulting algorithm, Alg. 3, given with the notations from the main paper, is as follows."
Open Source Code | No | The paper mentions that the competing methods' code is publicly available, but provides no explicit statement or link for its own source code.
Open Datasets | Yes | "We evaluate our method on a dataset of videos, which we collected from the DAVIS dataset (Pont-Tuset et al., 2017), the LOVEU-TGVE dataset (Wu et al., 2023b) and from the internet."
Dataset Splits | No | The paper describes the dataset used but does not specify training, validation, or test splits for it.
Hardware Specification | Yes | "On a single RTX A6000 GPU, the one we used for running all methods including ours, these methods could edit videos of only up to 30 frames."
Software Dependencies | No | The paper mentions Stable Diffusion v2.1 and RIFE, but does not give version numbers for software dependencies such as Python or PyTorch (a hedged environment sketch follows the table).
Experiment Setup | Yes | "We set the classifier-free guidance (Ho & Salimans, 2021) strength parameter to 10 in ϵ_EA and to 1 in ϵ_S. Moreover, we inject the extended attention features from the source video to the target video in 85% of the sampling process. We set γ, the balancing parameter in Eq. (1), to 0.8." (See the hyperparameter sketch after the table.)
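
Because the paper names the Stable Diffusion v2.1 backbone but pins no surrounding software stack, a reproduction has to choose its own. A minimal environment sketch, assuming the Hugging Face diffusers library, the `stabilityai/stable-diffusion-2-1` model id, and fp16 weights (none of which the paper states):

```python
# Hypothetical environment sketch: the paper names Stable Diffusion v2.1 as its
# backbone but pins no library versions, so the choice of Hugging Face
# diffusers, the model id, and fp16 weights below are all assumptions.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1",  # SD v2.1 checkpoint on the Hugging Face Hub
    torch_dtype=torch.float16,           # half precision, to fit on a single GPU
).to("cuda")                             # e.g. the single RTX A6000 reported in the paper
```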
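
The Experiment Setup row quotes three concrete hyperparameters: guidance strengths of 10 and 1 for the two noise predictions ϵ_EA and ϵ_S, extended-attention injection for 85% of the sampling process, and γ = 0.8 in Eq. (1). A minimal sketch of how these values could plug into a sampler step, assuming classifier-free guidance in its standard form (Ho & Salimans, 2021) and assuming Eq. (1) blends the two predictions as a convex combination weighted by γ; all function and variable names are illustrative, not the paper's:

```python
import torch

# All names here are illustrative; only the scalar values (guidance strengths
# 10 and 1, the 85% injection fraction, and gamma = 0.8) come from the paper.
CFG_SCALE_EA = 10.0     # classifier-free guidance strength for eps_EA
CFG_SCALE_S = 1.0       # classifier-free guidance strength for eps_S
INJECT_FRACTION = 0.85  # fraction of sampling steps with extended-attention injection
GAMMA = 0.8             # balancing parameter gamma in Eq. (1)

def cfg(eps_cond: torch.Tensor, eps_uncond: torch.Tensor, scale: float) -> torch.Tensor:
    # Standard classifier-free guidance (Ho & Salimans, 2021).
    return eps_uncond + scale * (eps_cond - eps_uncond)

def combined_noise_pred(eps_ea_cond, eps_ea_uncond, eps_s_cond, eps_s_uncond):
    # Assumption: Eq. (1) blends the two guided predictions as a convex
    # combination weighted by gamma; the exact form is given in the paper.
    eps_ea = cfg(eps_ea_cond, eps_ea_uncond, CFG_SCALE_EA)
    eps_s = cfg(eps_s_cond, eps_s_uncond, CFG_SCALE_S)
    return GAMMA * eps_ea + (1.0 - GAMMA) * eps_s

def inject_extended_attention(step: int, total_steps: int) -> bool:
    # Assumption: "85% of the sampling process" means the first 85% of steps.
    return step < INJECT_FRACTION * total_steps
```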