Layout Generation as Intermediate Action Sequence Prediction
Authors: Huiting Yang, Danqing Huang, Chin-Yew Lin, Shengfeng He
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct experiments on three datasets with different types of design, including mobile UI, scientific documents, and slides. Both automatic and human evaluations show that our approach performs consistently better than the baselines. |
| Researcher Affiliation | Collaboration | (1) South China University of Technology, (2) Microsoft Research Asia, (3) Singapore Management University |
| Pseudocode | No | The paper describes the action schema and derivation rules in paragraph text and mathematical formulas, but does not present them in a structured pseudocode or algorithm block. |
| Open Source Code | Yes | Code is available at https://github.com/microsoft/KC/tree/main/papers/Layout Action |
| Open Datasets | Yes | We evaluate on three datasets with different types of graphic designs: Rico (Deka et al. 2017; Liu et al. 2018), PubLayNet (Zhong, Tang, and Yepes 2019), and InfoPPT (Shi et al. 2022). |
| Dataset Splits | Yes | For all the datasets, we randomly split the dataset into 85% train, 5% validation, and 10% test. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments, such as GPU or CPU models. |
| Software Dependencies | No | The paper mentions the AdamW optimizer, the Transformer backbone, and a Python library (python-pptx), but does not specify version numbers for these software dependencies. |
| Experiment Setup | Yes | We set the Transformer with 6 layers of hidden size 512. The number of attention heads is set to 8. We use the AdamW optimizer (Loshchilov and Hutter 2017) with an initial learning rate of 3e-4, β1 = 0.9, β2 = 0.95, and L2 weight decay of 5e-4. We also apply early stopping, gradient clipping (Pascanu, Mikolov, and Bengio 2013), and warm-up over the initial 1% of training iterations. The dropout rate is set to 0.1. Models are trained for a maximum of 50 epochs with batch size 64. During inference, we sample from the multinomial distribution with top-k sampling (k = 5) for all models. |
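
The quoted experiment setup is detailed enough to approximate in code. Below is a minimal sketch, assuming a PyTorch implementation; it is not the authors' released code (see the GitHub link above). The feed-forward width, gradient-clipping norm, and step counts are not reported in the quote and are marked as assumptions in the comments.

```python
# Minimal sketch of the reported training configuration (PyTorch assumed).
import torch
import torch.nn as nn

# Reported architecture: 6-layer Transformer, hidden size 512, 8 heads, dropout 0.1.
# dim_feedforward is NOT reported; 2048 is an assumed placeholder.
encoder_layer = nn.TransformerEncoderLayer(
    d_model=512, nhead=8, dim_feedforward=2048, dropout=0.1, batch_first=True
)
model = nn.TransformerEncoder(encoder_layer, num_layers=6)

# Reported optimizer: AdamW, lr 3e-4, betas (0.9, 0.95), weight decay 5e-4.
optimizer = torch.optim.AdamW(
    model.parameters(), lr=3e-4, betas=(0.9, 0.95), weight_decay=5e-4
)

# Warm-up over the first 1% of training iterations, constant LR afterwards
# (the post-warm-up schedule is not stated in the quote).
total_steps = 50 * 1000  # assumption: 50 epochs x hypothetical steps per epoch
warmup_steps = max(1, int(0.01 * total_steps))
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer, lr_lambda=lambda step: min(1.0, (step + 1) / warmup_steps)
)

def train_step(batch: torch.Tensor, max_grad_norm: float = 1.0) -> float:
    """One optimization step with gradient clipping (clip value not reported; 1.0 assumed)."""
    optimizer.zero_grad()
    loss = model(batch).mean()  # placeholder loss; the real model predicts action tokens
    loss.backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_grad_norm)
    optimizer.step()
    scheduler.step()
    return loss.item()

# Inference: sample the next token from the multinomial distribution over the
# top-k (k = 5) logits, as described in the setup.
def sample_next_token(logits: torch.Tensor, k: int = 5) -> torch.Tensor:
    top_logits, top_idx = torch.topk(logits, k, dim=-1)
    probs = torch.softmax(top_logits, dim=-1)
    choice = torch.multinomial(probs, num_samples=1)
    return top_idx.gather(-1, choice)
```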