SphereDiffusion: Spherical Geometry-Aware Distortion Resilient Diffusion Model

Authors: Tao Wu, Xuewei Li, Zhongang Qi, Di Hu, Xintao Wang, Ying Shan, Xi Li

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on the Structured3D dataset (Zheng et al. 2020) demonstrate that our method can significantly improve the quality of controllable spherical image generation and relatively reduces around 35% FID on average compared to previous methods.
Researcher Affiliation | Collaboration | Tao Wu 1*, Xuewei Li 1*, Zhongang Qi 2, Di Hu 3, Xintao Wang 2, Ying Shan 2, Xi Li 1,4. 1 College of Computer Science and Technology, Zhejiang University; 2 ARC Lab, Tencent PCG; 3 Gaoling School of Artificial Intelligence, Renmin University of China; 4 Zhejiang Singapore Innovation and AI Joint Research Lab, Hangzhou
Pseudocode | No | The paper describes methods and formulas (e.g., L_LDM, L_C, D, L_all) but does not present any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any explicit statement or link regarding the availability of its source code.
Open Datasets | Yes | We evaluated our model on the Structured3D dataset (Zheng et al. 2020), which provides 196k spherical panoramic images of 21,835 rooms in 3,500 scenes.
Dataset Splits | No | The paper states "We use scene 00000 to scene 03249 for training, and scene 03250 to scene 03499 for testing" but does not specify a separate validation split or its details.
Hardware Specification | Yes | Our experiments are conducted with a server with eight NVIDIA A100 GPUs, and training epochs are 20.
Software Dependencies | No | The paper mentions "Stable Diffusion 1.5", "CLIP", and "BLIP" but does not provide specific version numbers for other required software dependencies or libraries.
Experiment Setup | Yes | Our experiments are conducted with a server with eight NVIDIA A100 GPUs, and training epochs are 20. We set (α_c, β_c, γ_c) = (360, 3, 3) and (α_d, β_d, γ_d) = (360, 10, 10). λ / N / K are set to 0.1 / 50 / 4, respectively.
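
Since no source code is released, the following is only a minimal sketch of how the quoted scene split and reported settings might be written down for a reproduction attempt. The `scene_XXXXX` naming pattern, the function names, and every field name in `ReportedSetup` are assumptions made for illustration; only the numeric values come from the Dataset Splits, Hardware Specification, and Experiment Setup rows.

```python
from dataclasses import dataclass

# Scene index ranges quoted in the Dataset Splits row (bounds inclusive).
TRAIN_SCENES = range(0, 3250)    # scene 00000 .. scene 03249
TEST_SCENES = range(3250, 3500)  # scene 03250 .. scene 03499


def scene_name(scene_id: int) -> str:
    """Zero-padded scene name; the 'scene_' prefix is an assumed convention."""
    return f"scene_{scene_id:05d}"


def split_of(scene_id: int) -> str:
    """Return 'train' or 'test' for a scene index, per the quoted ranges."""
    if scene_id in TRAIN_SCENES:
        return "train"
    if scene_id in TEST_SCENES:
        return "test"
    raise ValueError(f"scene id {scene_id} is outside 0..3499")


@dataclass(frozen=True)
class ReportedSetup:
    """Values from the table above; the field names are hypothetical."""
    num_gpus: int = 8                # eight NVIDIA A100 GPUs
    epochs: int = 20
    angles_c: tuple = (360, 3, 3)    # (alpha_c, beta_c, gamma_c)
    angles_d: tuple = (360, 10, 10)  # (alpha_d, beta_d, gamma_d)
    lam: float = 0.1                 # lambda
    n: int = 50                      # N
    k: int = 4                       # K
```

The paper specifies no validation split, so a reproduction that needs one would have to carve it out of the training range itself.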