Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
IterMeme: Expert-Guided Multimodal LLM for Interactive Meme Creation with Layout-Aware Generation
Authors: Yaqi Cai, Shancheng Fang, Yadong Qu, Xiaorui Wang, Meng Shao, Hongtao Xie
IJCAI 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results demonstrate that IterMeme significantly advances the field of meme creation by delivering consistently high-quality outcomes. The code, model, and dataset will be open-sourced to the community. |
| Researcher Affiliation | Collaboration | Yaqi Cai¹, Shancheng Fang², Yadong Qu¹, Xiaorui Wang², Meng Shao¹, Hongtao Xie¹; ¹University of Science and Technology of China, ²Yuan Shi Technology |
| Pseudocode | No | The paper describes methods and processes in narrative text and with mathematical formulations, but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code, model, and dataset will be open-sourced to the community. |
| Open Datasets | Yes | In the first stage, we utilize the MemeCap dataset introduced in section 3.3, along with a subset of the I2T category from the Oogiri-GO dataset introduced in [Zhong et al., 2024]. |
| Dataset Splits | No | The paper mentions the MemeCap dataset including 13,977 English samples and 22,457 Chinese samples, and an evaluation dataset of 60 IPs. However, it does not specify explicit training, validation, or test splits (e.g., percentages or counts for each split) for these datasets in the main text. |
| Hardware Specification | No | We acknowledge the support of GPU cluster built by MCC Lab of Information Science and Technology Institution, USTC. We also thank the USTC supercomputing center for providing computational resources for this project. |
| Software Dependencies | Yes | The unified MLLM for understanding and generation is built upon DreamLLM [Dong et al., 2023] as the base model, with SD2.1 [Rombach et al., 2022] serving as the visual decoder. |
| Experiment Setup | Yes | In the first training stage, the AdamW [Loshchilov, 2017] optimizer is employed with a learning rate of 1×10⁻⁵ and a weight decay parameter of 0. The batch size is set to 16, and gradient accumulation is performed over 2 steps to optimize computational efficiency. |
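The reported first-stage setup (AdamW, learning rate 1e-5, weight decay 0, batch size 16, gradient accumulation over 2 steps) can be sketched as follows. This is a minimal illustration on a single scalar parameter, not the paper's implementation; the per-micro-batch gradient values are hypothetical, and the `adamw_step` helper is our own.

```python
import math

def adamw_step(p, grad, state, lr=1e-5, betas=(0.9, 0.999),
               eps=1e-8, weight_decay=0.0):
    """One AdamW update on a scalar parameter p given gradient grad."""
    state["t"] += 1
    # Exponential moving averages of the gradient and its square
    state["m"] = betas[0] * state["m"] + (1 - betas[0]) * grad
    state["v"] = betas[1] * state["v"] + (1 - betas[1]) * grad ** 2
    # Bias correction
    m_hat = state["m"] / (1 - betas[0] ** state["t"])
    v_hat = state["v"] / (1 - betas[1] ** state["t"])
    # Decoupled weight decay (zero here, as reported for the first stage)
    p -= lr * weight_decay * p
    return p - lr * m_hat / (math.sqrt(v_hat) + eps)

batch_size, accum_steps = 16, 2          # effective batch of 32 per update
state = {"t": 0, "m": 0.0, "v": 0.0}
p = 1.0                                  # placeholder parameter
micro_grads = [0.5, 0.7]                 # hypothetical micro-batch gradients

accumulated = 0.0
for g in micro_grads:
    accumulated += g / accum_steps       # average across accumulation steps
p = adamw_step(p, accumulated, state)    # single optimizer step after 2 micro-batches
```

Dividing each micro-batch gradient by `accum_steps` keeps the accumulated gradient equivalent to one pass over the full effective batch of `batch_size * accum_steps` samples.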