Comment on You Actually Do Need to Understand Mythos | Hank Green

MissesAutumnRains@lemmy.blahaj.zone 3 months ago

In their paper, they post keys that can be verified once the vulnerabilities are patched (so they aren’t just revealing exploitable issues to the world). In the few cases they did demonstrate (ones that were quickly patched), the model showed a pretty sophisticated ability to find and exploit multiple vulnerabilities. The patches that you saw them mention are a direct result of Anthropic reporting those vulnerabilities.
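For anyone wondering how keys like that could work in principle, here is a minimal sketch of a hash commitment. I'm assuming the published "keys" are cryptographic hashes of the vulnerability write-ups, which may not match what they actually did; the function names `commit` and `verify` and the sample report text are made up for illustration.

```python
import hashlib
import hmac


def commit(report: bytes) -> str:
    """Return a SHA-256 digest that can be published without revealing the report."""
    return hashlib.sha256(report).hexdigest()


def verify(report: bytes, published_digest: str) -> bool:
    """Check that a later-revealed report matches the digest published earlier."""
    return hmac.compare_digest(commit(report), published_digest)


# Hypothetical write-up, kept private until the bug is patched.
# A real scheme would likely also mix in a random secret so the
# report text cannot be guessed from the digest.
secret_report = b"Hypothetical write-up of an unpatched vulnerability"
published = commit(secret_report)  # this is the part posted publicly

# After the patch ships, the report is revealed and anyone can check it.
print(verify(secret_report, published))
print(verify(b"A different report", published))
```

The point is that the digest gives nothing away now, but once the report is released nobody can claim it was written after the fact.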

The method they describe basically means they weren’t looking at old, patched code (where the model could have found mentions of the vulnerabilities that others had already pointed out on the web) but at current, actively used software. The vulnerabilities and exploits that the model found were novel zero-days (meaning problems that, as far as anyone knows, had not yet been discovered or exploited).

I’m not a researcher, though, so someone can correct anything I’ve gotten wrong here, but this is definitely not solely hype. It’s not exciting stuff (unless you just look at headlines), but the vulnerabilities they discovered are actual problems, especially if a model like this gets into the hands of bad actors.

MrKurteous@feddit.nu 3 months ago
Ah thanks, I didn’t find their paper, but you led me down the right path to some nice info on their blog! Great idea with the keys; it’s good that we will at least be able to verify in the future whether their claims are true. The bugs that were already fixed did indeed seem cool, but the blog is written in a slightly odd way, and I couldn’t find confirmation that those were also zero-day vulnerabilities. Either way, we should get plenty of confirmation with the keys. Thanks for the details!