ChatGPT's o3 Model Found Remote Zeroday in Linux Kernel Code


Submitted 5 months ago by Kissaki@beehaw.org to technology@beehaw.org

https://linuxiac.com/chatgpt-o3-model-found-remote-zeroday-in-linux-kernel-code/


Comments


knokelmaat@beehaw.org 5 months ago
I hate AI. Why?

Because of its extreme energy consumption compared to what it achieves

Because it is all in the hands of the worst companies on this planet

Because capitalists are foaming at the mouth to use it to fuck over workers

Because it is devaluing art and reducing it to another commodity to “produce”

However

I also took the time to read the original blog post, and it is a fascinating story.

The author starts out with using an existing vulnerability as a benchmark for ChatGPT testing. They describe how they took the code specific to the vulnerability and packaged it for ChatGPT, how they formatted the query and what their results were. In 100 runs only 8 correctly identify the targeted vulnerability, the rest are false positives or claim that there are no vulnerabilities in the given code.

Then they take their test a step further and increase the amount of code shared with ChatGPT so that it also includes stuff of the module that had nothing to do with the original vulnerability. As expected, this larger input decreases performance and also reduces the vulnerability detection rate for the targeted vulnerability. However, in those 100 runs, another vulnerability was described that wasn’t a false positive. An actual new vulnerability that the author didn’t know about was discovered. Again, the signal to noise ratio is very low, and one has to sift through a lot of wrong reports to get a realistic one, but this proved that it could be used as a useful tool for helping to detect vulnerabilities.
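To put the 8-in-100 figure in perspective, here is a rough sketch of what repeated runs buy you. It assumes each run is independent and has the same per-run hit rate, which the blog post does not claim; the function names are made up for illustration.

```python
def hit_rate(hits: int, runs: int) -> float:
    """Fraction of runs that reported the targeted vulnerability."""
    return hits / runs


def chance_of_at_least_one_hit(per_run_rate: float, runs: int) -> float:
    """Probability that at least one of `runs` runs finds the bug.

    Assumption: runs are independent and share the same hit rate.
    """
    return 1.0 - (1.0 - per_run_rate) ** runs


# Figures from the first experiment described above: 8 hits in 100 runs.
rate = hit_rate(8, 100)

for n in (1, 10, 50, 100):
    chance = chance_of_at_least_one_hit(rate, n)
    print(f"{n:>3} runs: {chance:.1%} chance of at least one correct report")
```

Under those assumptions a single run is a long shot, but a batch of runs is likely to contain at least one correct report. The sketch says nothing about the cost of false positives: someone still has to read and verify every report to find the real one.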

I highly recommend reading the blog post.

As much as I like to be critical about AI, it doesn’t help if we put our heads in the sand and act as if it never does something cool.
- shnizmuffin@lemmy.inbutts.lol 5 months ago
  
  “In 100 runs only 8 correctly identify the targeted vulnerability, the rest are false positives or claim that there are no vulnerabilities in the given code. … [The] signal to noise ratio is very low, and one has to sift through a lot of wrong reports to get a realistic one.”
  
  It was right 8% of the time when presented with the least amount of input to find a known bug. Then, when they opened it up to more of the codebase, its performance decreased.
  
  I’m not going to use something that’s wrong over 92% of the time. That’s insane. That’s like saying my Magic 8 Ball “could be used as a useful tool for helping to detect vulnerabilities.” The fucking rubber ducky on my desk has a more reliable clearance rate.
  
  - knokelmaat@beehaw.org 5 months ago
    This is literally the very first experiment in this use case, done by a single person on a model that wasn’t specifically designed for this. The fact that it is able to formulate a correct response at all in this situation impresses me.
    
    It would be easy to criticize this if it were the endpoint and this was being advertised as a tool for vulnerability research, but as discussed at the end of the post, this “quick little test” shows both initial promising results and had the fortunate byproduct of actually revealing a new vulnerability. By no means is it implied that it is now ready for use in this field.
    
    The issue with hallucinations is one that in my opinion is never going to be totally fixed. That is why I hate the use of AI as a final arbiter of truth, which is sadly how a lot of people use it (I’ll quickly ask ChatGPT) and companies advertise it. What it is good at however, is coming up with plausible ideas, and in this case having an indication for things to check in code can be a great tool to discover new stuff, as is literally the case for this security researcher finding a new vulnerability after auditing the module themselves.
    
- LukeZaz@beehaw.org 5 months ago
  Interesting. I feel like the headline is still bad though. I get why they ran with it, at least — “ChatGPT finds kernel exploit” is more interesting and gets more clicks than “Monkey finally writes Shakespeare.”
  
thomasembree@me.dm 5 months ago
@Kissaki In another thread, people are mocking AI because the free language models they are using are bad at drawing accurate maps: "AI can't even do geography." From that they conclude that anything an AI says can't be trusted and that AI is vastly inferior to human ability.
These same people haven't figured out the difference between using a language AI to draw a map, and simply asking it a geography question.
- FozzyOsbourne@lemm.ee 5 months ago
  [deleted]
  - ChairmanMeow@programming.dev 5 months ago
    I think the point is that even if LLMs suck at task A, they might be really good at task B. Just because code written by LLMs is often riddled with security flaws, doesn’t mean LLMs also suck at identifying those flaws.
    
- callouscomic@lemm.ee 5 months ago
  Your comment made me very curious, and I dunno if this is hilarious or disappointing.
  
  Image
  
- 2xsaiko@discuss.tchncs.de 5 months ago
  Daniel Stenberg has banned AI-edited bug reports from cURL because they were exclusively nonsense and just wasted their time. Just because it gets a hit once doesn’t mean it’s good at this either.
  
  - Kissaki@beehaw.org 5 months ago
    It does show that it can be a useful tool, though.
    
    Here, the security researcher was evaluating it and stumbled upon a previously undiscovered security bug. Obviously, they didn’t let the AI create the bug report without understanding it. They verified the answer and took action themselves, presumably analyzing, verifying, and reporting in a professional and respectful way.
    
    The cURL AI spam is an issue on the opposite side of that, but it doesn’t really tell us anything about capabilities. It tells us more about people. In my eyes, at least.
    
- apotheotic@beehaw.org 5 months ago
  Like, I get that there’s people who are mocking AI for the wrong reasons, and they’re silly for that, but there are very real reasons to dislike AI in many applications.
  
  Would ChatGPT be able to do this if its dataset had consisted only of ethically obtained data where the authors had provided consent? My money is on no, at least not yet. The technology is in its infancy and has powerful potential, but is having its progress boosted through highly unethical means.
  
  I’m so very much for the concept of AI, it’s a monumental technology space at its core. But it needs to be done right, and I fear that it never will be, and we will have to live with the sins of the existing models forever. I hope I will be wrong.
  
  If we can reach a future where models are trained on entirely consensual data and the environmental impact of their training and usage isn’t as dire, I’d be so happy.
  
  - thomasembree@me.dm 5 months ago
    @apotheotic The issue with copyright is an inevitable misstep that was bound to happen while figuring out this technology. However, some of the criticisms aren't about ethical issues surrounding copyright; they are about the marketability of skills (such as painting) that you either had to learn yourself or otherwise needed to pay someone to do for you.
    Now you can do that with an AI. Great for disabled people who can create freely now, bad for the artists who exploited that for financial gain.
  - thomasembree@me.dm 5 months ago
    @apotheotic As for things like creating images in the style of a specific artist, that is not plagiarism unless you are asking for a perfect replica of a specific art piece and claiming it as your own original work.
    All artists imitate the styles they find appealing, if you paint a Van Gogh style painting it isn't plagiarism of Van Gogh. Likewise, if I were to imitate Van Gogh's style using an AI, the resulting image would be my original work and not Van Gogh's creation.
- jarfil@beehaw.org 5 months ago
  There are 10 kinds of people: those who think they understand neural networks, those who try to understand neural networks, and those whose neural networks can’t spot the difference.
  
  It’s no coincidence how many people are bad at languages, communication, learning, or teaching. On the bright side, new generations are likely to be forced to get better.
  
  - thomasembree@me.dm 5 months ago
    @jarfil I think it's an unavoidable instinct. In our ancestral environment, it was basic survival sense to fear the unknown and assume it could be dangerous. Caution just makes sense in that scenario.
    There hasn't been enough time for our genes to adapt to our new, radically different environment. So people will continue to react to technological advances as if a tiger could leap out at any moment and maul them to death. Even I experience a vague unease, and I love technology.