Abstract
Scalable Vector Graphics (SVG) is an important image format widely adopted in graphic design because of its resolution independence and editability. The study of generating high-quality SVG has continuously drawn attention from both designers and researchers in the AIGC community. However, existing methods either produce unstructured outputs at huge computational cost or are limited to generating monochrome icons with over-simplified structures. To produce high-quality and complex SVG, we propose OmniSVG, a unified framework that leverages pre-trained Vision-Language Models (VLMs) for end-to-end multimodal SVG generation. By parameterizing SVG commands and coordinates into discrete tokens, OmniSVG decouples structural logic from low-level geometry for efficient training while maintaining the expressiveness of complex SVG structures. To further advance the development of SVG synthesis, we introduce MMSVG-2M, a multimodal dataset with two million richly annotated SVG assets, along with a standardized evaluation protocol for conditional SVG generation tasks. Extensive experiments show that OmniSVG outperforms existing methods and demonstrate its potential for integration into professional SVG design workflows.
Paper: arxiv.org/abs/2504.06263
Code: github.com/OmniSVG/OmniSVG/
Weights: huggingface.co/OmniSVG/OmniSVG
Project Page: omnisvg.github.io
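For anyone wondering what "parameterizing SVG commands and coordinates into discrete tokens" might look like in practice, here is a rough Python sketch. To be clear, this is not OmniSVG's actual tokenizer: the command list, the 200-unit grid, the id layout, and the names `coord_token` and `tokenize_path` are all assumptions made up for illustration. It only handles absolute M, L, C and Z commands.

```python
import re

# Hypothetical vocabulary layout (an assumption, not OmniSVG's real scheme):
# ids 0..3 are path commands, ids 4 and up encode one quantized (x, y) point.
COMMANDS = ["M", "L", "C", "Z"]
GRID = 200  # assumed canvas size in integer units
COORD_OFFSET = len(COMMANDS)


def coord_token(x, y):
    """Merge one (x, y) point into a single discrete token id."""
    xi = min(max(int(round(x)), 0), GRID - 1)  # clamp to the grid
    yi = min(max(int(round(y)), 0), GRID - 1)
    return COORD_OFFSET + xi * GRID + yi


def tokenize_path(d):
    """Turn an absolute-coordinate path string into a flat list of token ids."""
    tokens = []
    for cmd, args in re.findall(r"([MLCZ])([^MLCZ]*)", d):
        tokens.append(COMMANDS.index(cmd))
        nums = [float(n) for n in re.findall(r"-?\d*\.?\d+", args)]
        # M, L and C arguments are all (x, y) pairs, so pair the numbers up.
        for i in range(0, len(nums) - 1, 2):
            tokens.append(coord_token(nums[i], nums[i + 1]))
    return tokens


print(tokenize_path("M 10 20 L 30 40 Z"))
```

The point of a layout like this is that the model predicts a short sequence of ids from a fixed vocabulary instead of spelling out numbers character by character.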
rizzothesmall@sh.itjust.works 6 days ago
I am very into this if it can take a non-vector graphic as input and work to that. OpenAI’s attempts at that have been complete dickfarts
paraphrand@lemmy.world 6 days ago
This is the first time I’ve seen a model target SVG drafting. Anything you have seen previously about unicorns or whatever was just someone experimenting with interesting edge case usage.
Feeding a language model a bunch of vector art does not seem productive to me. So it makes sense that something like GPT4 sucks at it.
Even_Adder@lemmy.dbzer0.com 6 days ago
It can do IMG to SVG. Check out the right side of this image:
Image
GenderNeutralBro@lemmy.sdf.org 6 days ago
Hard to judge quality when what we’re seeing is practically a pixel-perfect recreation. The tricky part of automated vectorization is detecting and plotting curves in such a way that it scales correctly. Bad implementations will use too many elements, or include straight lines that should be parts of curves, etc. Those errors would not be visible in those low-res rasterizations.
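One rough way to check for those problems without relying on the rasterized preview is to count what is actually inside the SVG. Below is a minimal Python sketch using only the standard library; `path_stats` and `straight_share` are hypothetical helper names, and the "share of straight segments" number is just a crude heuristic I am assuming is useful, not an established quality metric.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter


def path_stats(svg_text):
    """Return (number of <path> elements, Counter of path commands used)."""
    root = ET.fromstring(svg_text)
    counts = Counter()
    n_paths = 0
    for el in root.iter():
        # Strip the XML namespace, e.g. "{http://www.w3.org/2000/svg}path".
        if el.tag.rsplit("}", 1)[-1] == "path":
            n_paths += 1
            letters = re.findall(r"[MLHVCSQTAZmlhvcsqtaz]", el.get("d", ""))
            counts.update(c.upper() for c in letters)
    return n_paths, counts


def straight_share(counts):
    """Fraction of drawing segments that are straight lines (L, H, V)."""
    straight = counts["L"] + counts["H"] + counts["V"]
    curved = counts["C"] + counts["S"] + counts["Q"] + counts["T"] + counts["A"]
    total = straight + curved
    return straight / total if total else 0.0


svg = (
    '<svg xmlns="http://www.w3.org/2000/svg">'
    '<path d="M0 0 L10 0 L10 10 Z"/>'
    '<path d="M0 0 C1 1 2 2 3 3"/>'
    "</svg>"
)
n, counts = path_stats(svg)
print(n, dict(counts), straight_share(counts))
```

A very high element count or a large share of straight segments on artwork that is visibly curvy would hint at the kind of bad vectorization described above, though it is no substitute for zooming into the actual output.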