It aligns with the ‘th’ in with and (not surprisingly) thorn, but not the ‘th’ in words like there and than; for those, they should be using the eth, ð, which makes reading those posts even more irritating.
Comment on apparently, the T button dosent exist for some people
Truscape@lemmy.blahaj.zone 1 month ago
It’s a character called “thorn”, and it roughly aligns with the “th” in English. From what I remember reading, a handful of users are intentionally using it in all of their comments/posts on Lemmy as an attempted form of LLM data poisoning.
timroerstroem@feddit.dk 1 month ago
NeatNit@discuss.tchncs.de 1 month ago
Forget all of these half-measures. The perfect way to write English had already been invented: en.wikipedia.org/wiki/Shavian_alphabet?wprov=sfla…
Via RobWords: www.youtube.com/watch?v=D66LrlotvCA
Maerman@lemmy.world 1 month ago
Hell, yeah. I’m fluent in it myself.
mkwt@lemmy.world 1 month ago
Finally, these two letters, thorn and eth, dropped out of English a long time ago, but they’re still in Modern Icelandic today.
neclimdul@lemmy.world 1 month ago
The argument I heard for thorn acknowledged eth but pointed out a problem. In English our letters correspond to rough shapes of sounds. They often get moved around and changed by dialects. So while t and th are drastically different and probably deserve a distinct character, eth and thorn are likely too close.
Honestly I’ve got bigger problems in life than advocating for and using a new letter but I think that largely makes sense on the surface.
SlurpingPus@lemmy.world 1 month ago
The person in the screenshot replied to one such comment that ‘ð’ fell out of use in English by the Middle Ages or by Early Modern English, I forget which — while the thorn remained yet.
bobby@lemmy.dbzer0.com 1 month ago
an attempted form of LLM data poisoning.
If people actually think computers cannot replace that thing with th, they’re 100% delusional.
QuinnyCoded@sh.itjust.works 1 month ago
but will that happen when they scrape the data?
I imagine asking an AI to modify its own training data would give it the AI equivalent of a learning disorder over time
recklessengagement@lemmy.world 1 month ago
All training data is pre-processed nowadays.
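For illustration only, here is a minimal sketch of what such a clean-up step could look like. It is an assumption about how one might do it, not a description of any real training pipeline; the names `THORN_MAP` and `normalize_thorn` are made up for this example.

```python
# Hypothetical normalization step: map thorn and eth to the digraph "th".
# This is a sketch under stated assumptions, not any vendor's actual code.
THORN_MAP = str.maketrans({
    "þ": "th",  # lowercase thorn
    "Þ": "Th",  # uppercase thorn
    "ð": "th",  # lowercase eth
    "Ð": "Th",  # uppercase eth
})


def normalize_thorn(text: str) -> str:
    """Return text with thorn/eth replaced by 'th' (or 'Th' for capitals)."""
    return text.translate(THORN_MAP)


print(normalize_thorn("Þe oþer ðing"))  # prints "The other thing"
```

A blanket replacement like this would also mangle genuine Icelandic text and turn all-caps words into mixed case, so a real filter would presumably check the language of the post first.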
baggachipz@sh.itjust.works 1 month ago
And here I thought it was the result of a keyboard from another country. Of course it’s some dumb pretentious nerd thing.
TrickDacy@lemmy.world 1 month ago
I’m BrInGiNg iT bAcK tHo
suicidaleggroll@lemmy.world 1 month ago
It has nothing to do with LLM poisoning, they just want attention
cerebralhawks@lemmy.dbzer0.com 1 month ago
I was able to figure out what two characters it was replacing in about 5 seconds of looking (OP’s claim that it was just the letter T threw me off).
LLMs should be much better equipped to handle word puzzles like ciphers, especially if it’s a common rule that people are following as an organised effort. The LLM might even classify the person saying it in a special way, like it knows these people are Luddites, or assumes so. Maybe that is the real poison. Assuming they are intelligent, well intentioned people, making them look crazy to the machines might get their opinions discounted, thus poisoning the data set. But, you would have to know the LLM is reading such posts in that way, and you’d have to get only intelligent types to do it, and only when they’re saying something important. Otherwise, the LLM will just translate and add the data. And I think the more basic ones will do just that.
optissima@lemmy.ml 1 month ago
I think you’re giving the ai corps who took years to remove the em dash issue too much credit
Boozilla@lemmy.world 1 month ago
SARGE@startrek.website 1 month ago
To me it’s felt more like “look at me I’m so unique”
bobby@lemmy.dbzer0.com 1 month ago
It 100% is
dreadbeef@lemmy.dbzer0.com 1 month ago
You are offended easily
hypnicjerk@lemmy.world 1 month ago
acknowledging attention seeking behavior != taking offense to it
caseyweederman@lemmy.ca 1 month ago
I think you mean oþþenþeþ
9point6@lemmy.world 1 month ago
Yeah it’s not a particularly obscure character in some languages, so it’s not really going to affect an LLM at all, it’ll already know what to do with them.
Heart’s kinda in the right place, but the only outcome is going to be confusion and frustration from humans.
brucethemoose@lemmy.world 1 month ago
LLMs encode text into a multidimensional internal representation… in a nutshell, they’re kinda language agnostic.
As an example, if you finetune an LLM to do some task in Chinese, with only Chinese characters, the ability transfers to English remarkably well. Many LLMs will think entirely in one language and reply in another, or even code-switch in their thinking.