Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

LayoutPrompter: Awaken the Design Ability of Large Language Models

Authors: Jiawei Lin, Jiaqi Guo, Shizhao Sun, Zijiang Yang, Jian-Guang Lou, Dongmei Zhang

NeurIPS 2023 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct experiments on all existing layout generation tasks using four public datasets. Despite the simplicity of our approach, experimental results show that LayoutPrompter can compete with or even outperform state-of-the-art approaches on these tasks without any model training or fine-tuning.
Researcher Affiliation | Collaboration | Jiawei Lin (Xi'an Jiaotong University), Jiaqi Guo (Microsoft), Shizhao Sun (Microsoft), Zijiang James Yang (Xi'an Jiaotong University), Jian-Guang Lou (Microsoft), Dongmei Zhang (Microsoft)
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | Our project is available here.
Open Datasets | Yes | We conduct experiments on 4 datasets, including RICO [27], PubLayNet [40], PosterLayout [9] and WebUI [24].
Dataset Splits | Yes | For RICO and PubLayNet, we adopt the same dataset splits as LayoutFormer++ [15]. For PosterLayout, the training set includes 9,974 poster-layout pairs, and the remaining 905 posters are used for testing. Regarding the WebUI dataset, we adopt the dataset splits provided by parse-then-place [24].
Hardware Specification | No | In this work, we conduct experiments on the GPT-3 [3] text-davinci-003 model. ... When running GPT-3, we fix the parameters to the default values of the OpenAI API... (Only API-based GPT-3 usage is described; no compute hardware is specified.)
Software Dependencies | No | The paper mentions using the 'GPT-3 text-davinci-003 model' and 'OpenAI API' but does not list specific version numbers for other software dependencies or libraries.
Experiment Setup | Yes | We place N = 10 exemplars in the prompt P. For each test sample, we generate L = 10 different outputs y_test. The hyper-parameters involved in the layout ranker module are set to λ1 = 0.2, λ2 = 0.2, and λ3 = 0.6. When running GPT-3, we fix the parameters to the default values of the OpenAI API, where the sampling temperature is 0.7 and the penalty-related parameters are set to 0.
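
For context on how the quoted hyper-parameters fit together, the sketch below shows one plausible sampling-and-ranking loop. It is a minimal illustration, not the authors' implementation: the metric_1/metric_2/metric_3 functions and the max_tokens value are assumptions, and the call uses the legacy (pre-1.0) OpenAI Completions API that text-davinci-003 required.

```python
# Illustrative sketch only; text-davinci-003 has since been retired.
# metric_1/2/3 are hypothetical stand-ins for the quality measures
# combined by the paper's layout ranker.
import openai

LAMBDA_1, LAMBDA_2, LAMBDA_3 = 0.2, 0.2, 0.6  # ranker weights from the paper
L_OUTPUTS = 10  # candidate layouts sampled per test input (L = 10)

def generate_candidates(prompt: str) -> list[str]:
    """Sample L candidates with the stated defaults (temperature 0.7,
    penalty-related parameters 0)."""
    response = openai.Completion.create(
        model="text-davinci-003",
        prompt=prompt,  # the prompt P carries the N = 10 in-context exemplars
        n=L_OUTPUTS,
        temperature=0.7,
        presence_penalty=0,
        frequency_penalty=0,
        max_tokens=800,  # assumed; the quoted setup states no token limit
    )
    return [choice.text for choice in response.choices]

def metric_1(layout: str) -> float:
    return 0.0  # placeholder quality metric

def metric_2(layout: str) -> float:
    return 0.0  # placeholder quality metric

def metric_3(layout: str) -> float:
    return 0.0  # placeholder quality metric

def rank(candidates: list[str]) -> str:
    """Pick the candidate with the best weighted score
    λ1*m1 + λ2*m2 + λ3*m3; maximization is assumed here, since the
    direction depends on how each metric is defined."""
    def score(layout: str) -> float:
        return (LAMBDA_1 * metric_1(layout)
                + LAMBDA_2 * metric_2(layout)
                + LAMBDA_3 * metric_3(layout))
    return max(candidates, key=score)
```

A call would then be rank(generate_candidates(prompt)), returning the top-scoring of the L = 10 sampled layouts.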