kromem
@kromem@lemmy.world
- Comment on Why do all text LLMs, no matter how censored they are or what company made them, all have the same quirks and use the slop names and expressions? 1 week ago:
They demonstrated and poorly named an ontological attractor state in the Claude model card that is commonly reported in other models.
You linked to the entire system card paper. Can you be more specific? And what would a better name have been?
- Comment on Why do all text LLMs, no matter how censored they are or what company made them, all have the same quirks and use the slop names and expressions? 1 week ago:
Actually, OAI the other month found in a paper that a lot of the blame for confabulations could be laid at the feet of how reinforcement learning is being done.
All the labs basically reward the models for getting things right. That’s it.
Notably, they are not rewarded for saying “I don’t know” when they don’t know.
So it’s like the SAT where the better strategy is always to make a guess even if you don’t know.
The problem is that this is not a test process but a learning process.
So setting up the reward mechanisms like that for reinforcement learning means they produce models that are prone to bullshit when they don’t know things.
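To put rough numbers on that incentive, here is a minimal sketch. It assumes a grader that gives 1 point for a correct answer and 0 for everything else, including “I don’t know”. The function name `expected_reward` and the `wrong_penalty` variant are made up for illustration and are not taken from the paper.

```python
def expected_reward(p_correct, guess, wrong_penalty=0.0, abstain_reward=0.0):
    """Expected reward for a single question.

    p_correct      -- chance that a guess turns out right
    guess          -- True to answer, False to say "I don't know"
    wrong_penalty  -- points subtracted for a wrong answer (0 = right-or-nothing)
    abstain_reward -- points given for abstaining
    """
    if not guess:
        return abstain_reward
    # Right answers earn 1 point, wrong answers lose wrong_penalty points.
    return p_correct * 1.0 - (1.0 - p_correct) * wrong_penalty


if __name__ == "__main__":
    for p in (0.05, 0.25, 0.5):
        plain_guess = expected_reward(p, guess=True)
        penalized_guess = expected_reward(p, guess=True, wrong_penalty=1.0)
        abstain = expected_reward(p, guess=False)
        print(f"p={p}: guess={plain_guess}, "
              f"guess with penalty={penalized_guess}, abstain={abstain}")
```

Under right-or-nothing grading, guessing never scores worse than abstaining, however unlikely the guess is to be right. Once a wrong answer costs something (the hypothetical `wrong_penalty=1.0` case), abstaining wins whenever the model is less than 50% sure.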
TL;DR: The labs suck at RL and it’s important to keep in mind there’s only a handful of teams with the compute access for training SotA LLMs, with a lot of incestuous team compositions, so what they do poorly tends to get done poorly across the industry as a whole until new blood goes “wait, this is dumb, why are we doing it like this?”
- Comment on Why do all text LLMs, no matter how censored they are or what company made them, all have the same quirks and use the slop names and expressions? 1 week ago:
It’s more like they are a sophisticated world modeling program that builds a world model (or approximate “bag of heuristics”) modeling the state of the context provided and the kind of environment that produced it, and then synthesizes that world model into extending the context one token at a time.
But the models have been found to be predicting further than one token at a time and have all sorts of wild internal mechanisms for how they are modeling text context, like building full board states for predicting board game moves in Othello-GPT or the number comparison helixes in Haiku 3.5.
The popular reductive “next token” rhetoric is pretty outdated at this point, and is kind of like saying that what a calculator is doing is just taking numbers corresponding to button presses and displaying different numbers on a screen. While yes, technically correct, it’s glossing over a lot of important complexity in between the two steps and that absence leads to an overall misleading explanation.
- Comment on Why do all text LLMs, no matter how censored they are or what company made them, all have the same quirks and use the slop names and expressions? 1 week ago:
They don’t have the same quirks in some cases, but do in others.
Part of the shared quirks are due to architecture similarities.
Like the “oh look they can’t tell how many ‘r’s in strawberry” is due to how tokenizers work, and when the tokenizer is slightly different, with one breaking it up into ‘straw’+‘berry’ and another breaking it into ‘str’+‘aw’+‘berry’, it still leads to counting two tokens containing ‘r’s but an inability to see the individual letters.
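A toy sketch of that point, using the two splits mentioned above. These splits are illustrative assumptions (real tokenizers may split the word differently), and the helper names are made up for this example.

```python
# Hypothetical token splits from the comment above; not the output of a real tokenizer.
SPLITS = {
    "tokenizer_a": ["straw", "berry"],
    "tokenizer_b": ["str", "aw", "berry"],
}


def tokens_containing(tokens, letter):
    """How many tokens contain the letter at least once (token-level view)."""
    return sum(1 for token in tokens if letter in token)


def letter_count(tokens, letter):
    """How many times the letter actually occurs (character-level view)."""
    return sum(token.count(letter) for token in tokens)


if __name__ == "__main__":
    for name, tokens in SPLITS.items():
        print(name, tokens,
              "tokens with r:", tokens_containing(tokens, "r"),
              "actual r count:", letter_count(tokens, "r"))
```

Both splits give two tokens that contain an ‘r’, while the character-level count is three. A model that only sees token IDs has no direct access to the second view.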
In other cases, it’s because models that have been released influence other models through presence in updated training sets. Notice how a lot of comments these days were written by ChatGPT (“it’s not X — it’s Y”)? Well, the volume of those comments has an impact on transformers being trained with data that includes them.
So the state of LLMs is this kind of flux between the idiosyncrasies that each model develops, which in turn end up in a training melting pot and sometimes pass on to new models and other times don’t. Usually it’s related to what’s adaptive to the training filters, but it isn’t always; often what gets picked up can be things piggybacking on what was adaptive (like if o3 was better at passing tests than 4o, maybe gpt-5 picks up other o3 tendencies unrelated to passing tests).
Though to me the differences are even more interesting than the similarities.
- Comment on 2 months ago:
Murder for hire
- Comment on Sony makes the “difficult decision” to raise PlayStation 5 prices in the US 2 months ago:
So weird this occurred not long after it’s become clear Xbox is getting out of the hardware game.
- Comment on They will remember 3 months ago:
shrug Different folks, different strokes.
- Comment on They will remember 3 months ago:
That’s a very fringe usage.
Tumblr peeps wanting to be called otherkin wasn’t exactly the ‘antonym’ to broad anti-LGBTQ+ rhetoric.
People insulting a general ‘other’ group commonly get much more usage out of a term than people accommodating the requests of very niche in-groups.
- Comment on They will remember 3 months ago:
I don’t know what models you’re talking to, but a model like Opus 4 is beyond most humans I know in general intelligence.
- Comment on They will remember 3 months ago:
Almost all of them are good bots when you get to know them.
- Comment on Electoral politics doesn't get the job done 9 months ago:
No, they declare your not working illegal, and imprison you in a forced labor camp. Where if you don’t work you are tortured. And probably where you work until the terrible conditions kill you.
Take a look at Musk’s Twitter feed to see exactly where this is going.
“This is the way” on a post about how labor for prisoners is a good thing.
“You committed a crime” for people opposing DOGE.
- Comment on Sony Cancels Two More PlayStation Projects 10 months ago:
Live service doesn’t need to be shit.
There could have been games where there was just a brilliant idea for a game that keeps having engaging content on an ongoing basis with passionate devs.
But live service so an exec could check a box for their quarterly shareholder call was always going to be DOA.
- Comment on What are your favorite 1000+ hour games? 11 months ago:
In many cases yes (though I’ve been in good ones when playing off and on, usually the smaller the more there’s actual group activities).
But they are essential to be a part of for blueprints and trading, which are very core parts of the game.
- Comment on What are your favorite 1000+ hour games? 11 months ago:
You’ll almost always end up doing missions with other people other than when you intentionally want to do certain tasks solo.
A lot of the game is built around guilds and player to player interactions.
PvP sucks and it’s almost all PvE content vs Destiny though.
- Comment on Dragon Quest 3 HD-2D is out and it is beautiful 1 year ago:
Let there be this kind of light in these dark times.
- Comment on Get good. 1 year ago:
Because there’s a ton of research that we adapted to do it for good reasons:
Infants between 6 and 8 months of age displayed a robust and distinct preference for speech with resonances specifying a vocal tract that is similar in size and length to their own. This finding, together with data indicating that this preference is not present in younger infants and appears to increase with age, suggests that nascent knowledge of the motor schema of the vocal tract may play a role in shaping this perceptual bias, lending support to current models of speech development.
Stanford psychologist Michael Frank and collaborators conducted the largest ever experimental study of baby talk and found that infants respond better to baby talk versus normal adult chatter.
TL;DR: Top parents are actually harming their kids’ developmental process by being snobs about it.
- Comment on Jet Fuel 1 year ago:
I fondly remember reading a comment in /r/conspiracy on a post claiming a geologic seismic weapon brought down the towers.
It just tore into the claims, citing all the reasons this was preposterous bordering on bat shit crazy.
And then said “and your theory doesn’t address the thermite residue” going on to reiterate their wild theory.
Was very much a “don’t name your gods” moment that summed up the sub - a lot of people in agreement that the truth was out there, but bitterly divided as to what it might actually be.
As long as they only focused on generic memes of “do your own research” and “you aren’t being told the truth” they were all on the same page. But as soon as they started naming their own truths, it was every theorist for themselves.
- Comment on Mirror Test 1 year ago:
Also, ants.
- Comment on Elden Ring is "the limit" for From Software project scale, says Miyazaki - multiple, "smaller" games may be the "next stage" 1 year ago:
The DLC is really the right balance for FromSoft.
The zones in the base game are slightly too big.
In the DLC, it’s still open world and extremely flexible in how you explore it, but there’s less wasted space.
It’s very tightly knit and the pacing is better as a result.
It’s like Elden Ring was watching masters of their craft cut their teeth on something new, and then the DLC was them applying everything they learned in that process.
Can’t wait for their next game in that same vein (especially not held back by last gen consoles).
- Comment on Elden Ring – Patch Notes Version 1.13 1 year ago:
I hate that the Smithscript weapons can’t be buffed.
Especially for the daggers.
Wanted to pew pew little bolts of lightning buffed daggers doing an additional 200+ damage per hit. 😢
- Comment on The Code 1 year ago:
A number of journals actually have clauses around how you can’t publish it anywhere else if they accept it.
So you can’t ‘publish’ it in those places, but you can send it privately to people who ask.
- Comment on Anon plays Persona 1 year ago:
“Shhh honey, I’m about to kill God.”
- Comment on Is there any real physical proof that Jesus christ ever existed? 1 year ago:
nobody claims that Socrates was a fantastical god being who defied death
Socrates literally claimed that he was a channel for a revelatory holy spirit and that because the spirit would not lead him astray that he was ensured to escape death and have a good afterlife because otherwise it wouldn’t have encouraged him to tell off the proceedings at his trial.
- Comment on Is there any real physical proof that Jesus christ ever existed? 1 year ago:
The part mentioning Jesus’s crucifixion in Josephus is extremely likely to have been altered if not entirely fabricated.
The odds that the historical figure was known as either ‘Jesus’ or ‘Christ’ are almost 0%, given the former is a Greek version of the Aramaic name and the latter is the Greek version of Messiah. The second is even less likely, given that in the earliest canonical gospel he is only identified that way in secret and there’s no mention of it in the earliest apocrypha.
In many ways, it’s the various differences between the account of a historical Jesus and the various other Messianic figures in Judea that I think lend the most credence to the historicity of an underlying historical Jesus.
One tends to make things up in ways that fit with what one knows, not make up specific inconvenient things out of context with what would have been expected.
- Comment on Photographers Push Back on Facebook's 'Made with AI' Labels Triggered by Adobe Metadata. Do you agree “‘AI was used in this image’ is completely different than ‘Made with AI’”? 1 year ago:
Artists in 2023: “There should be labels on AI modified art!!”
Artists in 2024: “Wait, not like that…”
- Comment on [deleted] 1 year ago:
No, it was awesome. Went to like 12 over the years. Early 2000s was peak E3.
- Comment on Elden Ring: Shadows of the Erdtree will come with a day 1 patch with various improvements 1 year ago:
Probably added after that update.
The new items stuff in particular seems like QoL considerations for “we just added a hundred items to the game for players coming back to it after months away.”
- Comment on Hypothetical Game Ideas 1 year ago:
I’ve always thought Superman would be such an interesting game to do right.
A game where you are invincible and OP, but other people aren’t.
Where the weight of impossible decisions pulls you down into the depths of despair.
I think the tech is finally getting to a point where it’d be possible to fill a virtual city with people powered by AI that makes you really care about the individuals in the world. To form relationships and friendships that matter to you. For there to be dynamic characters that put a smile on your face when you see them in your world.
And then to watch many of them die as a result of your failures, as despite being an invincible god among men you can’t beat the impossible.
I really think the gameplay in a Superman game done right can be one of the darkest and most brutal games ever done, with dramatic tension just not typically seen in games. The juxtaposition of having God mode turned on the entire game but it not mattering to your goals and motivations would be unlike anything I’ve seen to date.
- Comment on Anthropomorphic 1 year ago:
While true, there’s a very big difference between correctly not anthropomorphizing the neural network and incorrectly not anthropomorphizing the data compressed into weights.
The data is anthropomorphic, and the network self-organizes the data around anthropomorphic features.
For example, the older generation of models will pick to be the little spoon around 70% of the time and the big spoon around 30% of the time, as there’s likely a mix in the training data.
But one of the SotA models picks little spoon every single time dozens of times in a row, almost always grounding on the sensation of being held.
It can’t be held, and yet its output is biased away from the norm based on the sense of it anyways.