PanoDiffusion: 360-degree Panorama Outpainting via Diffusion
Authors: Tianhao Wu, Chuanxia Zheng, Tat-Jen Cham
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Results show that our PanoDiffusion not only significantly outperforms state-of-the-art methods on RGB-D panorama outpainting by producing diverse, well-structured results for different types of masks, but can also synthesize high-quality depth panoramas to provide realistic 3D indoor models. |
| Researcher Affiliation | Academia | Tianhao Wu¹, Chuanxia Zheng² & Tat-Jen Cham¹; ¹S-Lab, Nanyang Technological University (tianhao001@e.ntu.edu.sg, astjcham@ntu.edu.sg); ²University of Oxford (cxzheng@robots.ox.ac.uk) |
| Pseudocode | No | The paper does not contain a clearly labeled 'Pseudocode' or 'Algorithm' block. |
| Open Source Code | Yes | The supplementary materials are organized as follows: 1. A video is added to illuminate the work with more results. 2. The reproducible code is included. 3. An additional PDF for implementation, training, metrics details, as well as more quantitative and qualitative results. |
| Open Datasets | Yes | Dataset. We evaluated our model on the Structured3D dataset (Zheng et al., 2020), which provides 360° indoor RGB-D data following equirectangular projection with a 512×1024 resolution. |
| Dataset Splits | Yes | We split the dataset into 16930 train, 2116 validation, and 2117 test instances. |
| Hardware Specification | No | The paper mentions 'on the same devices' for training time comparison but does not provide specific hardware details such as GPU/CPU models or memory. |
| Software Dependencies | No | The paper does not provide specific version numbers for software dependencies. It mentions PyTorch and CUDA indirectly through reference to official implementations or pre-trained models, but without versions. |
| Experiment Setup | Yes | For training G, we use a weighted sum of the pixel-wise L1 loss and adversarial loss. The pixel-wise L1 loss is denoted as Lpixel, measuring the difference between the GT and the output panorama. ... Here the value of λ is set to 20 during the training. The training of VAEs is exactly the same as in (Rombach et al., 2022) with downsampling factor f=4. |
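
The experiment-setup excerpt above describes the generator objective as a weighted sum of a pixel-wise L1 loss and an adversarial loss with λ = 20. As a rough illustration only (not the authors' code), here is a minimal PyTorch sketch of such a loss. It assumes a non-saturating GAN objective and that λ scales the pixel term, since the excerpt does not pin down either choice; the names `generator_loss`, `fake_pano`, `gt_pano`, and `disc_logits_fake` are hypothetical.

```python
import torch
import torch.nn.functional as F

LAMBDA_PIXEL = 20.0  # lambda = 20, as reported in the paper

def generator_loss(fake_pano: torch.Tensor,
                   gt_pano: torch.Tensor,
                   disc_logits_fake: torch.Tensor) -> torch.Tensor:
    """Weighted sum of adversarial and pixel-wise L1 losses for training G."""
    # Pixel-wise L1 between the GT panorama and the generated output
    l_pixel = F.l1_loss(fake_pano, gt_pano)
    # Non-saturating adversarial term on the discriminator's logits
    # (an assumption; the excerpt does not name the GAN objective used)
    l_adv = F.softplus(-disc_logits_fake).mean()
    return l_adv + LAMBDA_PIXEL * l_pixel
```

The weighting direction (λ on the pixel term) is a common convention for GAN-plus-reconstruction objectives; if the paper instead scales the adversarial term, the constant simply moves to the other summand.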