AutomaTikZ: Text-Guided Synthesis of Scientific Vector Graphics with TikZ
Authors: Jonas Belouadi, Anne Lauscher, Steffen Eger
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We fine-tune LLaMA on DaTikZ, as well as our new model CLiMA, which augments LLaMA with multimodal CLIP embeddings. In both human and automatic evaluation, CLiMA and LLaMA outperform commercial GPT-4 and Claude 2 in terms of similarity to human-created figures, with CLiMA additionally improving text-image alignment. Our detailed analysis shows that all models generalize well and are not susceptible to memorization. |
| Researcher Affiliation | Academia | Jonas Belouadi, Natural Language Learning Group, Bielefeld University, Germany (jonas.belouadi@uni-bielefeld.de); Anne Lauscher, Data Science Group, University of Hamburg, Germany (anne.lauscher@uni-hamburg.de); Steffen Eger, Natural Language Learning Group, University of Mannheim, Germany (steffen.eger@uni-mannheim.de) |
| Pseudocode | No | The paper describes methods such as iterative resampling but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | We make our framework, AutomaTikZ, along with model weights and datasets, publicly available. https://github.com/potamides/AutomaTikZ |
| Open Datasets | Yes | As part of our AutomaTikZ project, we create DaTikZ, the first large-scale TikZ dataset to our knowledge, featuring approximately 120k paired TikZ drawings and captions. We make our framework, AutomaTikZ, along with model weights and datasets, publicly available. https://github.com/potamides/AutomaTikZ |
| Dataset Splits | No | Before fine-tuning our models on DaTikZ, we extract a sample of 1k human-created items to serve as our test set. The paper does not provide explicit training/validation/test dataset splits (e.g., percentages or counts for each) or reference a specific, predefined split. |
| Hardware Specification | No | The paper mentions 'constraints of our existing GPU resources' but does not provide specific hardware details such as GPU models, CPU models, or detailed cloud/cluster specifications used for running experiments. |
| Software Dependencies | No | The paper mentions software like LLaMA, CLIP, the Moses tokenizer, and AdamW, citing their respective papers, but does not provide specific version numbers for these or other ancillary software components used in the experiments. |
| Experiment Setup | Yes | We train for 12 epochs with AdamW (Loshchilov & Hutter, 2019) and a batch size of 128, but increase the learning rate to 5e-4 as this leads to faster convergence. We introduce trainable low-rank adaptation weights (LoRA; Hu et al., 2022) while keeping the base model weights frozen and in half precision (Micikevicius et al., 2018). Following Dettmers et al. (2023), we apply LoRA to all linear layers. |
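The LoRA setup quoted in the Experiment Setup row can be sketched numerically. The mechanism below follows the standard low-rank adaptation formulation of Hu et al. (2022); the dimensions, rank `r`, and scaling factor `alpha` are illustrative assumptions, not values reported in the paper.

```python
import numpy as np

# Sketch of a LoRA-adapted linear layer: the base weight W stays frozen
# (half precision in the paper); only the low-rank factors A and B are
# trained. Effective weight: W + (alpha / r) * B @ A.
rng = np.random.default_rng(0)
d_out, d_in, r, alpha = 64, 64, 8, 16  # illustrative, not the paper's config

W = rng.standard_normal((d_out, d_in))      # frozen base weight
A = rng.standard_normal((r, d_in)) * 0.01   # trainable down-projection
B = np.zeros((d_out, r))                    # trainable up-projection, zero-init

def lora_forward(x):
    """Forward pass through the LoRA-adapted linear layer."""
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.standard_normal(d_in)
# With B zero-initialized, the adapted layer initially equals the base layer,
# so fine-tuning starts from the frozen model's behavior.
assert np.allclose(lora_forward(x), W @ x)
```

Because only `A` and `B` (here 8×64 and 64×8) receive gradients, the trainable parameter count is a small fraction of the frozen 64×64 base weight, which is what makes applying LoRA to all linear layers feasible on limited GPU resources.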