InstructScene: Instruction-Driven 3D Indoor Scene Synthesis with Semantic Graph Prior
Authors: Chenguo Lin, Yadong Mu
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experimental results reveal that the proposed method surpasses existing state-of-the-art approaches by a large margin. Thorough ablation studies confirm the efficacy of crucial design components. |
| Researcher Affiliation | Academia | Chenguo Lin, Yadong Mu Peking University chenguolin@stu.pku.edu.cn, myd@pku.edu.cn |
| Pseudocode | No | No explicit pseudocode or algorithm blocks found. |
| Open Source Code | Yes | Project page: https://chenguolin.github.io/projects/InstructScene. "Our instruction-scene pair dataset and code for both training and evaluation can be found in https://chenguolin.github.io/projects/InstructScene." |
| Open Datasets | Yes | To fit practical scenarios and promote the benchmarking of instruction-driven scene synthesis, we curate a high-quality dataset containing paired scenes and instructions with the help of large language and multimodal models (Li et al., 2022; Ouyang et al., 2022; OpenAI, 2023). Our instruction-scene pair dataset and code for both training and evaluation can be found in https://chenguolin.github.io/projects/InstructScene. |
| Dataset Splits | Yes | We use the same data split for training and evaluation as ATISS (Paschalidou et al., 2021). |
| Hardware Specification | Yes | Our method takes about 12 seconds to generate a batch of 128 living rooms on a single A40 GPU. |
| Software Dependencies | No | The paper mentions software such as OpenShape, CLIP, Blender, and the clean-fid library, but does not provide version numbers for these dependencies. |
| Experiment Setup | Yes | We use 5-layer, 8-head Transformers with 512 attention dimensions and a dropout rate of 0.1 for all generative models in this work. They are trained with the AdamW optimizer (Loshchilov & Hutter, 2018) for 500,000 iterations with a batch size of 128, a learning rate of 1e-4, and a weight decay of 0.02. The exponential moving average (EMA) technique (Polyak & Juditsky, 1992; Ho et al., 2020) with a decay factor of 0.9999 is applied to the model parameters. |
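The reported experiment setup can be summarized as a training configuration plus an EMA update on the model parameters. The sketch below is a minimal, illustrative reconstruction, not the authors' actual code: the config keys and the `EMA` class are hypothetical names, and only the hyperparameter values and the standard EMA rule (shadow ← decay·shadow + (1 − decay)·param) come from the paper's description.

```python
# Hypothetical summary of the training setup reported in the paper.
# Key names are illustrative; values are taken from the Experiment Setup row.
TRAIN_CONFIG = {
    "transformer_layers": 5,
    "attention_heads": 8,
    "attention_dim": 512,
    "dropout": 0.1,
    "optimizer": "AdamW",
    "iterations": 500_000,
    "batch_size": 128,
    "learning_rate": 1e-4,
    "weight_decay": 0.02,
    "ema_decay": 0.9999,
}


class EMA:
    """Exponential moving average of model parameters.

    After each optimizer step: shadow <- decay * shadow + (1 - decay) * param.
    The paper uses decay = 0.9999; a smaller decay is used in the toy
    example below so the effect is visible after one step.
    """

    def __init__(self, params, decay=0.9999):
        self.decay = decay
        self.shadow = list(params)  # copy of the initial parameter values

    def update(self, params):
        d = self.decay
        self.shadow = [d * s + (1.0 - d) * p for s, p in zip(self.shadow, params)]


# Toy example: one EMA step on a single scalar "parameter".
ema = EMA([0.0], decay=0.5)
ema.update([1.0])
print(ema.shadow[0])  # 0.5 = 0.5 * 0.0 + 0.5 * 1.0
```

In practice the shadow parameters, not the raw ones, are used at evaluation time; with decay 0.9999 the shadow weights change very slowly, smoothing out noise from individual gradient steps.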