SceneCraft: An LLM Agent for Synthesizing 3D Scenes as Blender Code

Authors: Ziniu Hu, Ahmet Iscen, Aashi Jain, Thomas Kipf, Yisong Yue, David A Ross, Cordelia Schmid, Alireza Fathi

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our evaluation demonstrates that SceneCraft surpasses existing LLM-based agents in rendering complex scenes, as shown by its adherence to constraints and favorable human assessments. We conduct comprehensive experiments on both synthetic and real-world datasets.
Researcher Affiliation | Collaboration | (1) California Institute of Technology; (2) Google DeepMind.
Pseudocode | Yes | The pseudo-code of the whole dual-loop learning is illustrated in Alg 1.
Open Source Code | No | The paper does not provide a concrete statement or a specific link to the source code for the methodology described in this paper.
Open Datasets | Yes | For the Sintel movie dataset... Sintel Movie, which is an animated fantasy short film produced with Blender, where scripts and Blender scenes are open sourced [footnote 4]. We download all these scenes, using the first half as the training set and the remaining half for testing. Footnote 4: https://studio.blender.org/films/sintel/
Dataset Splits | No | The paper states 'using the first half as the training set and the remaining half for testing' but does not explicitly mention a separate validation split or its proportions.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types with speeds, or memory amounts) used for running its experiments.
Software Dependencies | No | The paper mentions software components like Blender, GPT-4V, VideoPoet, and txtai but does not provide specific version numbers for these dependencies.
Experiment Setup | Yes | In our refinement algorithm, the average number of iterations is hard-coded as 4 steps without early stopping. The maximum number of subproblems of the current system is 7, so the maximum number of tokens for each scene design is around 15k, and the average is 6k.
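
The split and refinement budget quoted above can be illustrated with a minimal sketch. This is not the authors' code, which the paper does not release. The function names `split_half` and `refine`, the constant names, and the choice to place the extra item in the test half when the scene count is odd are all assumptions made for illustration; only the half/half split, the 4 fixed refinement steps, and the limit of 7 subproblems come from the quoted text.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")

# Values taken from the quoted experiment setup.
REFINEMENT_STEPS = 4  # fixed iteration count, no early stopping
MAX_SUBPROBLEMS = 7   # upper bound on subproblems per scene


def split_half(items: Sequence[T]) -> Tuple[List[T], List[T]]:
    """Return (train, test): first half for training, the rest for testing.

    Assumption: with an odd count, the extra item goes to the test half.
    """
    mid = len(items) // 2
    return list(items[:mid]), list(items[mid:])


def refine(script: str, step: Callable[[str], str],
           steps: int = REFINEMENT_STEPS) -> str:
    """Apply the hypothetical `step` callback exactly `steps` times."""
    for _ in range(steps):  # always runs the full budget; never exits early
        script = step(script)
    return script
```

Here `step` stands in for whatever single refinement pass a system performs (for example, render, critique, and rewrite the script); its behaviour is left entirely to the caller. Calling `split_half` on an ordered list of scene identifiers gives the two halves, and `refine(script, step)` runs the fixed four-pass budget.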