Autoregressive Omni-Aware Outpainting for Open-Vocabulary 360-Degree Image Generation
Authors: Zhuqiang Lu, Kun Hu, Chaoyue Wang, Lei Bai, Zhiyong Wang
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Comprehensive experiments on two commonly used 360-degree image datasets for both indoor and outdoor settings demonstrate the state-of-the-art performance of our proposed method. |
| Researcher Affiliation | Collaboration | Zhuqiang Lu (1), Kun Hu (1,*), Chaoyue Wang (2), Lei Bai (3), Zhiyong Wang (1) — (1) The University of Sydney, (2) JD.com, (3) Shanghai AI Laboratory |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is available at https://github.com/zhuqiangLu/AOG-NET-360. |
| Open Datasets | Yes | we evaluate our proposed method with the LAVAL indoor HDR dataset (Gardner et al. 2017) for the 360-degree indoor image generation setting...For the outdoor setting, we utilize the LAVAL outdoor HDR dataset (Zhang and Lalonde 2017) |
| Dataset Splits | Yes | we used the official training and testing split in our experiments, in which we have 1,921 training samples and 312 testing samples. For the outdoor setting, we randomly sample 170 images as the training split and 40 images for testing purpose. |
| Hardware Specification | Yes | All experiments were conducted on an Nvidia RTX 3090. |
| Software Dependencies | No | In our experiment, we adopted the pretrained Stable Diffusion generative prior for each autoregressive generation step. In addition, we utilized the visual encoder and the text encoder of OpenCLIP (Cherti et al. 2023) for E360 and Etext, respectively. We utilized T2I-Adapter (Mou et al. 2023) as the architecture for NFoV guidance encoder ENFoV and omnigeometry guidance encoder Egeometry. |
| Experiment Setup | Yes | AOG-Net was trained using an AdamW optimizer (Loshchilov and Hutter 2019) with β1 = 0.9 and β2 = 0.999. It was trained for 240 epochs, with learning rate 1 × 10−4 and batch size 1. For inference, we leveraged DPM-Solver++ (Lu et al. 2023) as sampler with a step set to 25 and classifier-free-guidance (Ho and Salimans 2022) scale set to 2.5. |
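The classifier-free guidance scale of 2.5 quoted above refers to the standard rule that, at each sampling step, pushes the conditional noise prediction away from the unconditional one. A minimal sketch of that combination rule (the function name and the scalar inputs are illustrative, not from the paper, which applies this to full noise tensors inside the diffusion sampler):

```python
def cfg_combine(eps_uncond: float, eps_cond: float, scale: float) -> float:
    """Classifier-free guidance (Ho and Salimans 2022):
    eps = eps_uncond + scale * (eps_cond - eps_uncond).
    A scale of 1.0 recovers the plain conditional prediction;
    larger scales strengthen the conditioning signal."""
    return eps_uncond + scale * (eps_cond - eps_uncond)

# Hypothetical scalar predictions; scale = 2.5 as reported in the paper.
guided = cfg_combine(1.0, 2.0, 2.5)
print(guided)  # 1.0 + 2.5 * (2.0 - 1.0) = 3.5
```

In practice `eps_uncond` and `eps_cond` would be noise tensors from two forward passes of the denoiser (with an empty prompt and the text prompt, respectively); the arithmetic is identical elementwise.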