Comment on apparently, the T button dosent exist for some people

Truscape@lemmy.blahaj.zone ⁨1⁩ ⁨month⁩ ago

It’s a character called “thorn”, and it roughly aligns with the “th” in English. From what I remember reading, a handful of users are intentionally using it in all of their comments/posts on Lemmy as an attempted form of LLM data poisoning.



Boozilla@lemmy.world ⁨1⁩ ⁨month⁩ ago
[deleted]
- SARGE@startrek.website ⁨1⁩ ⁨month⁩ ago
  To me it’s felt more like “look at me I’m so unique”
  
  - bobby@lemmy.dbzer0.com ⁨1⁩ ⁨month⁩ ago
    It 100% is
    
  - dreadbeef@lemmy.dbzer0.com ⁨1⁩ ⁨month⁩ ago
    You are offended easily
    
    hypnicjerk@lemmy.world ⁨1⁩ ⁨month⁩ ago
    acknowledging attention seeking behavior != taking offense to it
    
    caseyweederman@lemmy.ca ⁨1⁩ ⁨month⁩ ago
    I think you mean oþþenþeþ
    
- 9point6@lemmy.world ⁨1⁩ ⁨month⁩ ago
  Yeah, it’s not a particularly obscure character in some languages, so it’s not really going to affect an LLM at all; it’ll already know what to do with it.
  
  Heart’s kinda in the right place, but the only outcome is going to be confusion and frustration from humans.
  
  - brucethemoose@lemmy.world ⁨1⁩ ⁨month⁩ ago
    LLMs encode text into a multidimensional internal representation… in a nutshell, they’re kinda language-agnostic.
    
    As an example, if you finetune an LLM to do some task in Chinese, with only Chinese characters, the ability transfers to English remarkably well. Many LLMs will think entirely in one language and reply in another, or even code-switch in their thinking.
    
timroerstroem@feddit.dk ⁨1⁩ ⁨month⁩ ago
It aligns with the ‘th’ in with and (not surprisingly) thorn, but not the ‘th’ in words like there and than; for those, they should be using the eth, ð, which makes reading those posts even more irritating.

- NeatNit@discuss.tchncs.de ⁨1⁩ ⁨month⁩ ago
  Forget all of these half-measures. The perfect way to write English has already been invented: en.wikipedia.org/wiki/Shavian_alphabet?wprov=sfla…
  
  Via RobWords: www.youtube.com/watch?v=D66LrlotvCA
  
  - Maerman@lemmy.world ⁨1⁩ ⁨month⁩ ago
    Hell, yeah. I’m fluent in it myself.
    
- mkwt@lemmy.world ⁨1⁩ ⁨month⁩ ago
  Finally, these two letters, thorn and eth, dropped out of English a long time ago, but they’re still in Modern Icelandic today.
  
- neclimdul@lemmy.world ⁨1⁩ ⁨month⁩ ago
  The argument I heard for thorn acknowledged eth but pointed out a problem. In English our letters correspond to rough shapes of sounds. They often get moved around and changed by dialects. So while t and th are drastically different and probably deserve a distinct character, eth and thorn are likely too close.
  
  Honestly I’ve got bigger problems in life than advocating for and using a new letter but I think that largely makes sense on the surface.
  
- SlurpingPus@lemmy.world ⁨1⁩ ⁨month⁩ ago
  The person in the screenshot replied to one such comment that ‘ð’ fell out of use in English by the Middle Ages or by Early Modern English, I forget which — while the thorn remained yet.
  
bobby@lemmy.dbzer0.com ⁨1⁩ ⁨month⁩ ago

“an attempted form of LLM data poisoning.”

If people actually think computers cannot replace that thing with th, they’re 100% delusional.
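
For what it’s worth, here is a minimal sketch of that replacement in Python. It assumes a plain character map; the names `THORN_MAP` and `normalize_thorn` are made up for illustration and are not taken from any real scraping or training pipeline. Mapping eth to “th” as well is also an assumption.

```python
# Hypothetical normalization step: map thorn and eth back to "th".
# The mapping and the function name are illustrative assumptions only.
THORN_MAP = str.maketrans({
    "þ": "th",  # lowercase thorn
    "Þ": "Th",  # uppercase thorn
    "ð": "th",  # lowercase eth
    "Ð": "Th",  # uppercase eth
})


def normalize_thorn(text: str) -> str:
    """Return text with thorn/eth characters replaced by 'th'/'Th'."""
    return text.translate(THORN_MAP)


print(normalize_thorn("I þink þat ðis is fine"))
```

A blanket map like this would also rewrite genuine Icelandic or Old English text, so a real pipeline would presumably need to check the language first.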

- QuinnyCoded@sh.itjust.works ⁨1⁩ ⁨month⁩ ago
  But will that happen when they scrape the data?
  I imagine asking an AI to modify its own training data would give it the AI equivalent of a learning disorder over time.
  
  - recklessengagement@lemmy.world ⁨1⁩ ⁨month⁩ ago
    All training data is pre-processed nowadays.
    
baggachipz@sh.itjust.works ⁨1⁩ ⁨month⁩ ago
And here I thought it was the result of a keyboard from another country. Of course it’s some dumb pretentious nerd thing.

- TrickDacy@lemmy.world ⁨1⁩ ⁨month⁩ ago
  I’m BrInGiNg iT bAcK tHo
  
suicidaleggroll@lemmy.world ⁨1⁩ ⁨month⁩ ago
It has nothing to do with LLM poisoning; they just want attention.

cerebralhawks@lemmy.dbzer0.com ⁨1⁩ ⁨month⁩ ago
I was able to figure out what two characters it was replacing in about 5 seconds of looking (OP’s claim that it was just the letter T threw me off).

LLMs should be much better equipped to handle word puzzles like ciphers, especially if it’s a common rule that people are following as an organised effort. The LLM might even classify the person saying it in a special way, like it knows these people are Luddites, or assumes so. Maybe that is the real poison. Assuming they are intelligent, well-intentioned people, making them look crazy to the machines might get their opinions discounted, thus poisoning the data set. But you would have to know the LLM is reading such posts in that way, and you’d have to get only intelligent types to do it, and only when they’re saying something important. Otherwise, the LLM will just translate and add the data. And I think the more basic ones will do just that.

- optissima@lemmy.ml ⁨1⁩ ⁨month⁩ ago
  I think you’re giving the AI corps, who took years to remove the em dash issue, too much credit.
  