OmniSVG: A Unified Scalable Vector Graphics Generation Model

Submitted 4 months ago by Even_Adder@lemmy.dbzer0.com to stable_diffusion@lemmy.dbzer0.com

https://omnisvg.github.io/assets/OmniSVG-main-demo-1080.mp4

Abstract

Scalable Vector Graphics (SVG) is an important image format widely adopted in graphic design because of its resolution independence and editability. The study of generating high-quality SVG has continuously drawn attention from both designers and researchers in the AIGC community. However, existing methods either produce unstructured outputs at huge computational cost or are limited to generating monochrome icons with over-simplified structures. To produce high-quality and complex SVG, we propose OmniSVG, a unified framework that leverages pre-trained Vision-Language Models (VLMs) for end-to-end multimodal SVG generation. By parameterizing SVG commands and coordinates into discrete tokens, OmniSVG decouples structural logic from low-level geometry for efficient training while maintaining the expressiveness of complex SVG structure. To further advance the development of SVG synthesis, we introduce MMSVG-2M, a multimodal dataset with two million richly annotated SVG assets, along with a standardized evaluation protocol for conditional SVG generation tasks. Extensive experiments show that OmniSVG outperforms existing methods and demonstrates its potential for integration into professional SVG design workflows.
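
To make "parameterizing SVG commands and coordinates into discrete tokens" a bit more concrete, here is a minimal sketch of one way such a scheme could work. It is an illustration only, not the OmniSVG tokenizer: the command set, the 200-unit grid, the vocabulary layout, and every name (`GRID`, `tokenize_path`, `detokenize`) are assumptions made for this example, and the real model's vocabulary may differ. Each command gets its own token id, and each (x, y) pair is quantized to the grid and flattened into a single id.

```python
import re

# Hypothetical vocabulary layout (an assumption for this sketch, not OmniSVG's):
# command tokens come first, followed by one token per quantized grid cell.
COMMANDS = ["M", "L", "C", "Z"]      # absolute commands only, for brevity
ARG_PAIRS = {"M": 1, "L": 1, "C": 3, "Z": 0}  # (x, y) pairs each command takes
GRID = 200                           # assumed canvas size in quantized units
COORD_OFFSET = len(COMMANDS)         # coordinate ids start after command ids


def quantize(value: float) -> int:
    """Round a coordinate to the nearest grid unit and clamp it to the canvas."""
    return max(0, min(GRID - 1, round(value)))


def point_token(x: float, y: float) -> int:
    """Flatten a quantized (x, y) pair into a single token id."""
    return COORD_OFFSET + quantize(x) * GRID + quantize(y)


def tokenize_path(d: str) -> list[int]:
    """Turn a path string into token ids.

    Only explicit, absolute M/L/C/Z commands are handled; implicit command
    repetition and relative commands raise ValueError.
    """
    parts = re.findall(r"[MLCZ]|-?\d+(?:\.\d+)?", d)
    tokens: list[int] = []
    i = 0
    while i < len(parts):
        cmd = parts[i]
        if cmd not in ARG_PAIRS:
            raise ValueError(f"expected a command, got {cmd!r}")
        tokens.append(COMMANDS.index(cmd))
        i += 1
        for _ in range(ARG_PAIRS[cmd]):
            tokens.append(point_token(float(parts[i]), float(parts[i + 1])))
            i += 2
    return tokens


def detokenize(tokens: list[int]) -> str:
    """Map token ids back to a (quantized) path string."""
    out = []
    for t in tokens:
        if t < COORD_OFFSET:
            out.append(COMMANDS[t])
        else:
            x, y = divmod(t - COORD_OFFSET, GRID)
            out.append(f"{x} {y}")
    return " ".join(out)


if __name__ == "__main__":
    ids = tokenize_path("M 10 20 L 30.4 40 Z")
    print(ids)
    print(detokenize(ids))
```

With this layout the whole vocabulary is `len(COMMANDS) + GRID * GRID` ids, so a sequence model only has to predict one token per command and one per point. The round trip is lossy by design: coordinates come back snapped to the grid.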

Paper: arxiv.org/abs/2504.06263

Code: github.com/OmniSVG/OmniSVG/

Weights: huggingface.co/OmniSVG/OmniSVG

Project Page: omnisvg.github.io

Comments

rizzothesmall@sh.itjust.works 4 months ago
I am very into this if it can take a non-vector graphic as input and work from that. OpenAI’s attempts at that have been complete dickfarts

- paraphrand@lemmy.world 4 months ago
  This is the first time I’ve seen a model target SVG drafting. Anything you have seen previously about unicorns or whatever was just someone experimenting with interesting edge case usage.
  
  Feeding a language model a bunch of vector art does not seem productive to me. So it makes sense that something like GPT-4 sucks at it.
  
- Even_Adder@lemmy.dbzer0.com 4 months ago
  It can do IMG to SVG. Check out the right side of this image:
  
  Image
  
  - GenderNeutralBro@lemmy.sdf.org 4 months ago
    Hard to judge quality when what we’re seeing is practically a pixel-perfect recreation. The tricky part of automated vectorization is detecting and plotting curves in such a way that it scales correctly. Bad implementations will use too many elements, or include straight lines that should be parts of curves, etc. Those errors would not be visible in those low-res rasterizations.
    
TropicalDingdong@lemmy.world 4 months ago
[creams self in simple features]

- Even_Adder@lemmy.dbzer0.com 4 months ago
  Your name is crazy. 🤣
  
  - TropicalDingdong@lemmy.world 4 months ago
    I want to fine-tune this model on large geospatial datasets.
    
outhouseperilous@lemmy.dbzer0.com 4 months ago
Okay, you let me tie this into a spreadsheet or something to generate charts, and there’s finally a use case for this that I like.

I’m not sure it’s worth needing a 5080 to make ultra pretty graphs, but, you know; smoke ’em if you got ’em.
