Comment on Mozilla Firefox's new alt-text generator, powered by a "fully private on-device AI model"

Now i want this as a standalone command-line binary: take an image and give me a single-phrase description. (Gut feeling says this already exists, but depending on Teh Cloudz and OpenAI, not fully local and on-device for computers without a GPU.)


umami_wasbi@lemmy.ml ⁨1⁩ ⁨year⁩ ago
Ollama + llava-llama3

You now just need a CLI wrapper to interact with the Ollama API.
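
A rough sketch of what such a wrapper could look like, assuming an Ollama server is already running locally on its default port and the `llava-llama3` model has been pulled. The prompt wording and the script layout are illustrative assumptions, not an existing tool, and the quality of the one-phrase answer depends entirely on the model:

```python
#!/usr/bin/env python3
"""Print a one-phrase description of an image using a local Ollama server."""
import base64
import json
import sys
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint
MODEL = "llava-llama3"  # assumes `ollama pull llava-llama3` was run beforehand
PROMPT = "Describe this image in a single short phrase."  # hypothetical prompt wording


def build_payload(image_bytes, model=MODEL, prompt=PROMPT):
    """Build the JSON body for a non-streaming /api/generate request."""
    return {
        "model": model,
        "prompt": prompt,
        # Ollama expects images as base64-encoded strings
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,  # ask for one JSON object instead of a token stream
    }


def describe(image_path, url=OLLAMA_URL):
    """Send the image to the local server and return the model's text reply."""
    with open(image_path, "rb") as f:
        payload = build_payload(f.read())
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"].strip()


def main(argv):
    if len(argv) != 2:
        print(f"usage: {argv[0]} IMAGE", file=sys.stderr)
        return 2
    print(describe(argv[1]))
    return 0


if __name__ == "__main__":
    sys.exit(main(sys.argv))
```

Saved under a name of your choosing (say `describe_image.py`), it would be called as `python3 describe_image.py photo.jpg`. It uses only the standard library, so nothing beyond Ollama and the model needs installing, and no traffic leaves the machine.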

- jherazob@beehaw.org ⁨1⁩ ⁨year⁩ ago
  So, it’s possible to build but no one has made it yet? Because i have negative interest in messing with that kinda tech, and would rather just “apt-get install whatever-image-describing-gizmo” so i wouldn’t be the one who does it
  
  - Swedneck@discuss.tchncs.de ⁨1⁩ ⁨year⁩ ago
    this is how i feel about basically all technology nowadays, it’s all so artificially limited by capitalism.
    
    nothing fucking progresses unless someone figures out a way to monetize it or an autistic furry decides to revolutionize things in a weekend because they were bored and inventing god was almost stimulating enough
    
  - drwho@beehaw.org ⁨1⁩ ⁨year⁩ ago
    Folks have made it. I think ollama was name-checked specifically because it’s on GitHub and in Homebrew and in some distros’ package repositories (it’s definitely in Arch’s). I think some folks (at least) aren’t talking about it because of the general hate-on folks have for LLMs these days.
    
    jherazob@beehaw.org ⁨1⁩ ⁨year⁩ ago
    I don’t want an LLM to chat with or whatever folks do with those things. i want a command i can just install: i call the binary in a terminal window with an image of some sort as a parameter, and it returns a single phrase describing the image, on a typical office machine with no significant GPU and zero internet access.
    
    Right now i cannot do this as far as i know. Pointing me at some LLM and “Go build yourself something with that” is the direct opposite of what i said i wanted. So, it doesn’t currently seem to exist; that’s why i said i wished somebody would rip it out of the Firefox source and make it a standalone command.
    
- Zworf@beehaw.org ⁨1⁩ ⁨year⁩ ago
  Yes, I was just writing that. I would love to see more integrations that can talk to ollama.
  
Marsupial@quokk.au ⁨1⁩ ⁨year⁩ ago
Any multimodal LLM could do this in a heartbeat locally.

And OpenAI has made their shit freely available to run locally, it’s like the worst company to use as an example.

- photonic_sorcerer@lemmy.dbzer0.com ⁨1⁩ ⁨year⁩ ago
  Where is this freely available multimodal GPT-4 you speak of?
  