Comment on You Actually Do Need to Understand Mythos | Hank Green

<- View Parent
MissesAutumnRains@lemmy.blahaj.zone ⁨1⁩ ⁨week⁩ ago

In their paper, they post keys that can be verified once the vulnerabilities are patched (so they aren’t just revealing exploitable issues to the world) but in the few that they demonstrated (ones that were quickly patched), it demonstrated a pretty sophisticated ability to find and exploit multiple vulnerabilities. The patches that you saw them mention are a direct result of Anthropic reporting those vulnerabilities.

The method they talk about is basically saying that they weren’t looking at old, patched code (which would mean that the model could have found vulnerability mentions on the web that others have pointed out) but rather current, actively used software. The vulnerabilities and exploits that the model found were novel, zero day (meaning as of yet they are unexploited, ‘undiscovered’ problems).

I’m not a researcher though, so someone can correct any information I’ve gotten wrong here, but this is definitely not solely hype. It’s not exciting stuff (unless you just look at headlines) but the vulnerabilities they discovered are like actual problems, especially if a model like this gets into the hands of bad actors.

source
Sort:hotnewtop