Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

OmniSVG: A Unified Scalable Vector Graphics Generation Model

Authors: Yiying Yang, Wei Cheng, Sijin Chen, Xianfang Zeng, Fukun Yin, Jiaxu Zhang, Liao Wang, Gang Yu, Xingjun Ma, Yu-Gang Jiang

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental Extensive experiments show that OmniSVG outperforms existing methods and demonstrates its potential for integration into professional SVG design workflows. To validate the effectiveness of our method, we first introduce the baselines (Sec. 5.1). Then, we make quantitative comparisons with prior arts (Secs. 5.2 and 5.3) and conduct ablations (Sec. 5.4) to study the effectiveness of our design.
Researcher Affiliation Collaboration Yiying Yang1,2, Wei Cheng2, Sijin Chen1, Xianfang Zeng2, Fukun Yin1,2, Jiaxu Zhang2, Liao Wang2, Gang Yu2, Xingjun Ma1, Yu-Gang Jiang1; 1 Fudan University, 2 StepFun
Pseudocode No The paper describes methods and processes using descriptive text and mathematical equations, but it does not contain any clearly labeled pseudocode or algorithm blocks.
Open Source Code Yes Answer: [Yes] Justification: We provide access to the data and evaluation code for reproduction.
Open Datasets Yes To further advance the development of SVG synthesis, we introduce MMSVG-2M, a multimodal dataset with two million richly annotated SVG assets, along with a standardized evaluation protocol for conditional SVG generation tasks. Answer: [Yes] Justification: We provide access to the data and evaluation code for reproduction.
Dataset Splits Yes Table 6: Data Statistics for MMSVG-2M. Our MMSVG-2M consists of 1.1 million SVG icons, 0.5 million SVG illustrations, and 0.4 million SVG anime characters.
Dataset | Train | Val | Total | Source | Token Length
MMSVG-Icon | 990k | 110k | 1,100k | Iconfont | 2.2k ± 0.9k
MMSVG-Illustration | 450k | 50k | 500k | IconScout | 8.1k ± 3.3k
MMSVG-Character | 350k | 50k | 400k | Freepik & generated | 28k ± 7.3k
Hardware Specification No The computations in this research were performed using the CFFF platform of Fudan University. This mentions a platform but no specific hardware like GPU/CPU models or memory.
Software Dependencies No We train our models in bfloat16 with the ZeRO2 strategy [40] for memory-efficient training. We also adopt the AdamW [33] optimizer with a learning rate decaying from 3 × 10⁻⁴ to 3 × 10⁻⁶ and a weight decay of 0.1 to train our model. In practice, we load the pre-trained weights from the Qwen2.5-VL [1] model and initialize the SVG embeddings from scratch. While specific strategies and models are mentioned, no general software (e.g., Python, PyTorch) with version numbers is listed.
Experiment Setup Yes We train our models in bfloat16 with the ZeRO2 strategy [40] for memory-efficient training. We also adopt the AdamW [33] optimizer with a learning rate decaying from 3 × 10⁻⁴ to 3 × 10⁻⁶ and a weight decay of 0.1 to train our model. In practice, we load the pre-trained weights from the Qwen2.5-VL [1] model and initialize the SVG embeddings from scratch. Without further specification, we generate SVGs with the top-k and top-p sampling strategy with k = 50 and p = 0.95 for diversity.
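
The Experiment Setup row quotes top-k and top-p sampling with k = 50 and p = 0.95 but gives no implementation details. The following is a minimal sketch of how such a sampler is commonly written, not the authors' code. The function names (top_k_top_p_filter, sample_token) are hypothetical, and applying top-k before top-p is an assumption; the paper excerpt does not state the order.

```python
import math
import random


def top_k_top_p_filter(logits, k=50, p=0.95):
    """Return {token_id: probability} after top-k, then top-p filtering.

    Assumption: top-k is applied first, then nucleus (top-p) filtering.
    """
    # Softmax with max-subtraction for numerical stability.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Top-k: keep the k most probable token ids, most probable first.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]

    # Top-p: keep the shortest prefix whose cumulative probability reaches p.
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= p:
            break

    # Renormalise over the surviving tokens.
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}


def sample_token(logits, k=50, p=0.95, rng=random):
    """Draw one token id from the filtered distribution."""
    dist = top_k_top_p_filter(logits, k, p)
    ids = list(dist)
    return rng.choices(ids, weights=[dist[i] for i in ids], k=1)[0]
```

In an autoregressive decoder, sample_token would be called once per generation step on the logits of the next token. Smaller k or p makes outputs more deterministic, while the quoted values (k = 50, p = 0.95) leave room for diversity.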