Comment on Lutris now being built with Claude AI, developer decides to hide it after backlash
Vlyn@lemmy.zip 2 days ago
You might genuinely be using it wrong.
At work we have a big push to use Claude, but as a tool, not a developer replacement. And it’s working pretty damn well when properly set up.
Mostly using Claude Sonnet 4.6 with Claude Code. It’s important to run /init and check the output; that produces a CLAUDE.md file describing your project (which always gets added to your context).
Important: review everything the AI writes, this is not a hands-off process. For bigger changes, use planning mode and split tasks up; the smaller the task, the better the output.
Claude Code automatically uses subagents to fetch information, e.g. API documentation. Nowadays it’s extremely rare that it hallucinates something that doesn’t exist. It might use outdated info and need a nudge, like after the recent upgrade to .NET 10 (but just adding that info to the project context file is enough).
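For anyone who hasn’t run /init yet: the CLAUDE.md it generates is just a markdown project brief. A purely hypothetical minimal one (project name, stack, and conventions all made up) looks something like:

```markdown
# Project: MyService (hypothetical example)

## Stack
- .NET 10, ASP.NET Core, xUnit for tests

## Conventions
- Nullable reference types enabled; warnings treated as errors
- Run `dotnet test` before committing

## Notes for the agent
- We are on .NET 10; don't suggest APIs that were deprecated before .NET 8
```

Since the whole file rides along in every context window, keeping it short and current matters more than making it exhaustive.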
p03locke@lemmy.dbzer0.com 2 days ago
Agreed, I don’t understand people not even giving it a chance. They try it for five minutes, it doesn’t do exactly what they want, they give up on it, and shout how shit it is.
Meanwhile, I put the work in, see it do amazing shit after figuring out the basics of how the tech works, write rules and skills for it, have it figure out complex problems, etc.
It’s like handing your 90-year-old grandpa the Internet, and they don’t know what the fuck to do with it. It’s so infuriating.
Zos_Kia@jlai.lu 1 day ago
Just yesterday I had one of those moments of grace that are becoming commonplace.
Basically I had to migrate a service from an n8n workflow to an actual Node.js server for performance reasons. I spent 15 minutes carefully scoping the migration, telling it exactly which tools to use and which code style to adopt, then gave it the original brief and access to the n8n workflows.
The whole thing was done in 4 minutes and 30 seconds. It even noticed a bug that had been in production, unnoticed, for the past year. It gave me good documentation on how to set up the Google service account, plus the kind of memory usage to expect so I can dimension the instance accordingly. Another five minutes and I had a whole test suite with decent coverage. I had negotiated with the client that it would take around a week; well, that was the under-promise of the year…
People who go around saying it doesn’t work are incompetent, out of their minds, or straight up lying.
Vlyn@lemmy.zip 2 days ago
It’s not really that simple. Yes, it’s a great tool when it works, but in the end it boils down to being a text prediction machine.
So a nice helper to throw shit at, but I trust the output as much as a random Stackoverflow reply with no votes :)
p03locke@lemmy.dbzer0.com 2 days ago
And we’re barely smarter than a bunch of monkeys throwing piles of shit at each other. Being reductive about its origins doesn’t really explain anything.
Yeah, but that’s why there are unit tests. Let it run its own tests and solve its own bugs. How many mistakes have you or I made because we hate writing unit tests? At least the LLM has no problem writing the tests, and afterwards you know it works.
svtdragon@lemmy.world 2 days ago
I’ve had better luck with using it in a TDD style. “Write a test for this issue, watch it fail, then make it pass.”
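The loop is simple enough to sketch. Everything here is made up for illustration (the `slugify` function and its whitespace bug are hypothetical):

```python
import re

def slugify(text: str) -> str:
    """Lowercase the text and collapse whitespace runs into single hyphens."""
    # The originally buggy version used r" " instead of r"\s+", producing
    # "hello--world" for double spaces; this is the fixed implementation.
    return re.sub(r"\s+", "-", text.strip()).lower()

# The TDD loop with the model:
# 1. "Write a test for this issue" -> it produces something like:
def test_slugify_collapses_whitespace():
    assert slugify("Hello  World") == "hello-world"

# 2. Run the test and watch it fail against the buggy version.
# 3. "Now make it pass" -> the model patches slugify until the test is green.
```

Watching the test fail first is the important part: it proves the test actually exercises the bug before the model starts "fixing" things.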
dream_weasel@sh.itjust.works 2 days ago
I feel like there needs to be a post (and I don’t want to write it, but maybe I eventually will) that outlines what a model really is. It is not just a statistical text prediction machine unless you are being so loose with the definition of “statistical” that it doesn’t even mean anything anymore.
A decent example of a statistical text prediction machine is the middle word suggested by your phone when you’re using the keyboard. An LLM is not that.
In the most general terms, this kind of language model tokenizes a corpus of text based on a vocabulary (which is probably more than just the words in the dictionary) and uses an embedding model to translate these tokens into a vector of semantic “meaning” that minimizes loss in a bidirectional encoding (probably). That is then trained against a rubric for one or more topic-area questions, retrained for instruction-following and explainability, retrained with reinforcement learning from human feedback to provide guardrails, and retrained again to make use of supplemental materials not part of the original training corpus (retrieval-augmented generation), then distilled, then probably scaled and fine-tuned against topic areas of choice (like coding or Korean or whatever), and maybe THEN made available to people to use. There are generally more parts to curriculum learning even than that, but it’s a representative-ish start.
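Just to make the first two stages concrete, here’s a toy sketch. The vocabulary, embedding dimension, and values are all made up; real tokenizers work on subwords, not whole words:

```python
import numpy as np

# Tiny made-up vocabulary mapping words to token ids.
vocab = {"the": 0, "capital": 1, "of": 2, "austria": 3, "<unk>": 4}

# Embedding table: one 8-dimensional "meaning" vector per token id.
# Real models learn these during training; here they're just random.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(len(vocab), 8))

def tokenize(text: str) -> list[int]:
    """Map each word to its token id, falling back to <unk>."""
    return [vocab.get(w, vocab["<unk>"]) for w in text.lower().split()]

token_ids = tokenize("The capital of Austria")   # [0, 1, 2, 3]
vectors = embeddings[token_ids]                  # shape (4, 8): one vector per token
```

Everything downstream (attention, the training rubric, RLHF) operates on those vectors, which is why “statistical text prediction” undersells what happens between input and output.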
My point being that, yes, it would be nuts to pose ANY question to a predictor that says “with 84% probability, the word that is most likely follows ‘I really like’ is ‘gooning’ on reddit”, but even Grok is wildly more sophisticated than that and Grok is terrible.
Vlyn@lemmy.zip 2 days ago
The training is sophisticated, but at inference time it really is a text prediction machine. Technically token prediction, but you get the idea.
And that happens for every single token/word. You input your system prompt, context, and user input, then the output starts:
The
Feed the entire context back in and add the reply “The” at the end.
The capital
Feed everything in again with “The capital”
The capital of
Feed everything in again…
The capital of Austria
…
It literally works like that, which sounds crazy :)
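The loop above fits in a few lines. This is only a sketch: `model` here is a stand-in that replays a canned continuation, where a real network would do a full forward pass over the context each step:

```python
def model(context: list[str]) -> str:
    """Stand-in for the network: returns the next token given the full context."""
    canned = ["The", "capital", "of", "Austria", "is", "Vienna", "<eos>"]
    # A real model scores the entire context; this toy just tracks position.
    return canned[len(context) - 1]

prompt = ["<system + user prompt>"]
context = list(prompt)
output: list[str] = []
while True:
    next_token = model(context)   # one full forward pass per token
    if next_token == "<eos>":
        break
    output.append(next_token)
    context.append(next_token)    # feed everything back in, plus the new token
# output == ["The", "capital", "of", "Austria", "is", "Vienna"]
```

That re-feeding of the whole context on every step is also why long outputs get expensive: each new token pays for everything before it (caching helps, but the principle stands).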
The only control you have as a user is over the sampling: temperature, top-k, and so on. But that just softens or randomizes how deterministic the output is.
moseschrute@lemmy.world 2 days ago
Most people on Lemmy probably haven’t given it a single minute, let alone five.