Localize, Understand, Collaborate: Semantic-Aware Dragging via Intention Reasoner

Authors: Xing Cui, Peipei Li, Zekun Li, Xuannan Liu, Yueying Zou, Zhaofeng He

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Both qualitative and quantitative comparisons demonstrate the superiority of LucidDrag over previous methods.
Researcher Affiliation | Academia | 1 Beijing University of Posts and Telecommunications; 2 University of California, Santa Barbara
Pseudocode | Yes | A.2 Algorithm Pipeline of LucidDrag: To facilitate the understanding of our LucidDrag, we present the entire algorithm pipeline in Algorithm 1 (Algorithm 1: Proposed LucidDrag).
Open Source Code | Yes | Code is available at: https://github.com/cuixing100876/LucidDrag-NeurIPS2024.
Open Datasets | Yes | Following DragDiffusion [53], we utilize the DragBench benchmark, which is designed for the image-dragging task.
Dataset Splits | No | No explicit mention of validation dataset splits or their usage was found, other than general training and testing.
Hardware Specification | Yes | The training of the discriminator can be conducted on an NVIDIA V100 GPU, and the inference can be conducted on an NVIDIA GeForce RTX 3090 GPU.
Software Dependencies | No | No specific software versions (e.g., PyTorch 1.9, Python 3.8) were provided; only general dependency names such as "Adam optimizer" and "Stable Diffusion" are mentioned.
Experiment Setup | Yes | To train the quality discriminator, we employ the Adam optimizer with a learning rate of 1e-4. We set the training epochs as 100 and the batch size as 128. For the denoising process, we adopt Stable Diffusion [51] as the base model. During sampling, the number of denoising steps is set to T = 50 with a classifier-free guidance scale of 5. The energy weights for g_quality, g_drag, and g_content are set to 1e-3, 4e-4, and 4e-4, respectively.
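For quick reference, the quoted hyperparameters can be collected into a minimal PyTorch-style sketch. This is an illustration only: the `QualityDiscriminator` class below is a hypothetical placeholder (the paper's actual discriminator architecture is not given in this summary), and the energy-guided sampling loop itself is omitted.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the quality discriminator; the real architecture
# is not specified here, only its optimizer and training schedule.
class QualityDiscriminator(nn.Module):
    def __init__(self, feature_dim: int = 512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feature_dim, 256),
            nn.ReLU(),
            nn.Linear(256, 1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

# Discriminator training settings quoted from the paper.
discriminator = QualityDiscriminator()
optimizer = torch.optim.Adam(discriminator.parameters(), lr=1e-4)
num_epochs, batch_size = 100, 128

# Sampling settings for the Stable Diffusion base model (quoted values).
num_denoising_steps = 50   # T = 50
guidance_scale = 5.0       # classifier-free guidance

# Energy weights for the g_quality, g_drag, and g_content guidance terms.
w_quality, w_drag, w_content = 1e-3, 4e-4, 4e-4
```

These values mirror the setup quoted above; any names not appearing in the paper (e.g., the placeholder feature dimension) are assumptions made for illustration.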