SphereDiffusion: Spherical Geometry-Aware Distortion Resilient Diffusion Model
Authors: Tao Wu, Xuewei Li, Zhongang Qi, Di Hu, Xintao Wang, Ying Shan, Xi Li
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on the Structured3D dataset (Zheng et al. 2020) demonstrate that our method can significantly improve the quality of controllable spherical image generation and relatively reduces around 35% FID on average compared to previous methods. |
| Researcher Affiliation | Collaboration | Tao Wu1*, Xuewei Li1*, Zhongang Qi2, Di Hu3, Xintao Wang2, Ying Shan2, Xi Li1,4. 1 College of Computer Science and Technology, Zhejiang University; 2 ARC Lab, Tencent PCG; 3 Gaoling School of Artificial Intelligence, Renmin University of China; 4 Zhejiang Singapore Innovation and AI Joint Research Lab, Hangzhou |
| Pseudocode | No | The paper describes methods and formulas (e.g., L_LDM, L_C, D, L_all) but does not present any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any explicit statement or link regarding the availability of its source code. |
| Open Datasets | Yes | We evaluated our model on the Structured3D dataset (Zheng et al. 2020), which provides 196k spherical panoramic images of 21,835 rooms in 3,500 scenes. |
| Dataset Splits | No | The paper states "We use scene 00000 to scene 03249 for training, and scene 03250 to scene 03499 for testing" but does not specify a separate validation split or its details. |
| Hardware Specification | Yes | Our experiments are conducted with a server with eight NVIDIA A100 GPUs, and training epochs are 20. |
| Software Dependencies | No | The paper mentions "Stable Diffusion 1.5", "CLIP", and "BLIP" but does not provide specific version numbers for other required software dependencies or libraries. |
| Experiment Setup | Yes | Our experiments are conducted with a server with eight NVIDIA A100 GPUs, and training epochs are 20. We set (αc, βc, γc) = (360, 3, 3) and (αd, βd, γd) = (360, 10, 10). λ / N / K are set to 0.1 / 50 / 4, respectively. |
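
The scene-level split quoted in the Dataset Splits row can be reproduced with a few lines of Python. The following is a minimal sketch, not the authors' code: the helper names `scene_name` and `split_of` are hypothetical, and the `scene_XXXXX` directory naming pattern is an assumption about how the dataset is laid out on disk. Only the index ranges come from the quoted text.

```python
# Scene index ranges quoted in the Dataset Splits row.
TRAIN_SCENES = range(0, 3250)     # scene 00000 .. scene 03249
TEST_SCENES = range(3250, 3500)   # scene 03250 .. scene 03499


def scene_name(index: int) -> str:
    """Format a scene index as a zero-padded name (assumed naming pattern)."""
    return f"scene_{index:05d}"


def split_of(index: int) -> str:
    """Return 'train' or 'test' for a scene index, per the quoted ranges."""
    if index in TRAIN_SCENES:
        return "train"
    if index in TEST_SCENES:
        return "test"
    raise ValueError(f"scene index {index} is outside 0..3499")


# Materialize the two lists of scene names (hypothetical helper output).
train_names = [scene_name(i) for i in TRAIN_SCENES]
test_names = [scene_name(i) for i in TEST_SCENES]
```

Under these ranges the split covers 3,250 training scenes and 250 test scenes, which together account for the 3,500 scenes mentioned in the Open Datasets row. No validation subset is carved out here because the paper does not describe one.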