In-game Residential Home Planning via Visual Context-aware Global Relation Learning

Authors: Lijuan Liu, Yin Yang, Yi Yuan, Tianjia Shao, He Wang, Kun Zhou

AAAI 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Qualitative and quantitative experiments demonstrate that the recommended locations reflect the implicit spatial rules of components in residential estates, and that the method is instructive and practical for placing building units in complex 3D construction scenes.
Researcher Affiliation | Collaboration | 1 NetEase Fuxi AI Lab, 2 School of Computing, Clemson University, 3 State Key Lab of CAD&CG, Zhejiang University, 4 University of Leeds. Emails: {liulijuan, yuanyi}@corp.netease.com, yin5@clemson.edu, tjshao@zju.edu.cn, H.E.Wang@leeds.ac.uk, kunzhou@acm.org
Pseudocode | No | The paper includes mathematical equations and descriptions of network components but does not provide pseudocode or algorithm blocks.
Open Source Code | No | The paper neither states that the source code is released nor links to a code repository.
Open Datasets | No | We collected nearly 150K residential garden plans designed by players of a popular online game, which provides a large area of 165 grids × 183 grids (1 grid = 64 pixels) and multiple building units of different sizes. The dataset was collected by the authors and is not stated to be publicly available; no link or citation for access is provided.
Dataset Splits | Yes | We choose 22.4K designs as the training set and the remaining 5.6K are used for testing.
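The quoted sizes (22.4K + 5.6K = 28K) correspond to a standard 80/20 split of the curated subset, though the paper separately mentions collecting nearly 150K plans. A minimal sketch of such a split (function name and seed are illustrative assumptions; the paper does not release its splitting code):

```python
import random

def split_designs(designs, train_frac=0.8, seed=0):
    """Shuffle a list of designs and split it into train/test subsets."""
    idx = list(range(len(designs)))
    random.Random(seed).shuffle(idx)
    cut = int(len(idx) * train_frac)
    return [designs[i] for i in idx[:cut]], [designs[i] for i in idx[cut:]]

# 28K designs at an 80/20 ratio reproduce the quoted 22.4K / 5.6K sizes.
train, test = split_designs(list(range(28_000)))
print(len(train), len(test))  # 22400 5600
```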
Hardware Specification | Yes | The model is trained on 4 Titan X 2080 GPUs.
Software Dependencies | No | For the ConvNet that extracts visual clues for each component, we implement it based on Detectron2 and choose ResNet50 as the backbone. Software is named, but no version numbers are given for Detectron2 or for the framework running ResNet50.
Experiment Setup | Yes | The aspect ratio is set as [0.25, 0.5, 1.0, 2.0, 4.0]. ... The cropped features are then transformed into visual clues of size 1,024 ... node representations of size 512. The node representations are then updated for 4 rounds ... messages of dimension 128, with the 4-head attention mechanism outputs concatenated ... To model the latent dependencies between edges, the number of mixtures in the edge distribution model is set to 10. During the training phase, the batch size is 32 with the Adam solver for optimization and an initial learning rate of lr = 10^-4.
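The hyperparameters quoted above are the main reproducibility anchors the paper provides. Collected as a single config (key names are illustrative assumptions, since no code is released):

```python
# Hyperparameters reported in the paper; key names are our own labels.
config = {
    "aspect_ratios": [0.25, 0.5, 1.0, 2.0, 4.0],  # anchor aspect ratios
    "visual_clue_dim": 1024,      # size of per-component visual clues
    "node_dim": 512,              # graph node representation size
    "message_rounds": 4,          # rounds of node-representation updates
    "message_dim": 128,           # dimension of passed messages
    "attention_heads": 4,         # heads whose outputs are concatenated
    "num_mixtures": 10,           # mixtures in the edge distribution model
    "batch_size": 32,
    "optimizer": "adam",
    "learning_rate": 1e-4,        # initial lr
}
print(config["batch_size"], config["learning_rate"])  # 32 0.0001
```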