Comment

Comment on Developer Creates Infinite Maze That Traps AI Training Bots

This looks interesting. I’d probably combine it with model poisoning - giving each page longer chunks of text, containing bullshit claims and “grammar of slightly brokenness”; so if the data is used to train a model with, the result gets worse.

source

Sort:hotnew top

jherazob@beehaw.org ⁨9⁩ ⁨months⁩ ago
By now i’ve seen like 6 or 7 projects on either trapping or outright poison content for LLM bots, and yesterday i saw this one which outright modifies the HTML of your page making it harder to steal and instead replaces it with a random prompt, someone asks it to summarize a blogpost and instead the thing starts talking about poodles or something, while normal browser users notice nothing

source
- lvxferre@mander.xyz ⁨9⁩ ⁨months⁩ ago
  Yup, something like this - but for the honeypot, not for the legit pages.
  
  source
  - jherazob@beehaw.org ⁨9⁩ ⁨months⁩ ago
    I think many of them already do, i know Iocaine uses Markov chain text generation to spit out nonsense to poison the LLMs, do check it, at the bottom of the project page it links to others
    
    source