Comment on Not likely to be AI-generated or Deepfake

TheRealKuni@lemmy.world 3 days ago

So because you “make” AI generated images you are saying that they are magical and don’t follow the rules of their generation?

That’s what you got from what I wrote?

There’s nothing “magical,” but the variety of AI images that can be produced belies the simplicity of their detection. Which has been my point this whole time.

They are based on noise maps and inferred forwards from there.

There are countless ways to diffuse noise into an image, and changing any one of a large number of variables produces a different image. Even with the same seed and model, different noise samplers can produce entirely different types of images. And there are a LOT of different samplers. And thousands of models.

Then there are an enormous number of LoRAs that can add or remove concepts or styles. There are ControlNets that let whoever is generating the image control other aspects of the generation, from things like poses to depth mapping to edge smoothing to color noise offsets, and many, many more.

The number of tweaks that can be made by someone trying to generate a specific concept is insanely high. And the outputs are wildly different.
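To make the sampler point concrete, here is a toy sketch in plain Python. It is not a real diffusion model: `toy_denoise`, `sample`, and the fixed `target` are hypothetical stand-ins I made up for illustration. The only real-world idea it borrows is that some samplers are deterministic while "ancestral" samplers re-inject fresh noise at each step, so the same seed does not pin down the same output.

```python
import random

def toy_denoise(x, target, strength):
    """Hypothetical stand-in for a model step: nudge each value toward a target."""
    return [xi + strength * (ti - xi) for xi, ti in zip(x, target)]

def sample(seed, steps, ancestral, target):
    """Run a toy sampling loop starting from seeded noise."""
    rng = random.Random(seed)
    # Same seed means the same starting noise for both sampler styles.
    x = [rng.gauss(0.0, 1.0) for _ in target]
    for step in range(steps):
        x = toy_denoise(x, target, strength=0.3)
        if ancestral:
            # Ancestral-style update: add fresh noise that shrinks to zero
            # by the final step (assumed schedule, chosen for illustration).
            sigma = 0.5 * (1 - (step + 1) / steps)
            x = [xi + rng.gauss(0.0, sigma) for xi in x]
    return x

target = [0.2, 0.8, 0.5, 0.1]  # made-up "image" of four values
a = sample(seed=42, steps=20, ancestral=False, target=target)
b = sample(seed=42, steps=20, ancestral=True, target=target)

print("deterministic:", [round(v, 4) for v in a])
print("ancestral:    ", [round(v, 4) for v in b])
print("identical?", a == b)
```

In this toy both runs still drift toward the same target, so the difference is small. A real model is nonlinear, and that is where the same seed with a different sampler can land somewhere else entirely.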

I don’t pretend to be an expert in this subject; I’ve barely scratched the surface.

In the video I linked they even talk about how the red blue green maps have the same values cause it started with a colorless pixel anyways. A real sensor doesn’t do that.

No, they give an extremely simple explanation of how noise maps work, and then speak as if it were law, “You’ll never see an AI image that’s mostly dark with a tiny little bit of light or mostly light with a tiny little bit of dark.” Or “You won’t have an AI photo of a flat sunny field with no dark spots.”

But that’s simply not true. It’s nonsense that sounds simple enough to be believable, but the reality isn’t that simple. Each step diffuses further from the initial noise maps. And you can adjust how that happens, whether it focuses more in lighter or darker areas, or in busier or smoother areas.
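A rough way to picture "each step diffuses further from the initial noise" is the toy below. It is purely illustrative and says nothing about how any particular model behaves: `drift_from_noise` and `target_brightness` are names I invented, with `target_brightness` standing in for whatever the prompt, model, and settings steer the image toward.

```python
import random

def drift_from_noise(seed, steps, target_brightness, size=64):
    """Track mean brightness as toy steps pull seeded noise toward a target."""
    rng = random.Random(seed)
    # Starting "noise map": uniform values in [0, 1), so the mean sits near 0.5.
    x = [rng.random() for _ in range(size)]
    means = [sum(x) / size]
    for _ in range(steps):
        # Hypothetical step: move every pixel a quarter of the way to the target.
        x = [xi + 0.25 * (target_brightness - xi) for xi in x]
        means.append(sum(x) / size)
    return means

dark = drift_from_noise(seed=7, steps=15, target_brightness=0.05)
light = drift_from_noise(seed=7, steps=15, target_brightness=0.95)

print("start mean (same noise):", round(dark[0], 3), round(light[0], 3))
print("final mean, dark target: ", round(dark[-1], 3))
print("final mean, light target:", round(light[-1], 3))
```

Both runs start from the exact same noise, and the final brightness is decided by where the steps push it, not by where the noise started.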

Just because someone on YouTube says something with confidence doesn’t mean they’re right. YouTubers often scratch the surface of whatever they’re researching to give an overview of the subject because that’s their job. I don’t fault them for it. But they aren’t experts.

(Neither am I, but I know enough to know they don’t know what they’re talking about at depth.)

Everything they present in that video as law or fact has already been considered by people who know far more about the subject than these YouTubers (or me).

I did mention earlier that this sort of thing might be true for DALL-E or Midjourney or other cheap/free online services with no settings the user can tweak: AI images generated with as few steps as possible, with as little machine use as possible. They will be easier to spot, more uniform. But those aren’t the whole of AI image generation.

Another thing to consider: this technology is, at any given moment, at the worst it’s going to be going forward. The leaps and bounds that have been made in image diffusion even in the last year are remarkable. It is currently, sometimes, difficult to detect AI images. As time goes on, it will become harder.

(Which your video example even says.)
