Mniot
@Mniot@programming.dev
- Comment on ELI5. Limit of current gen AI/LLMs 1 week ago:
The “agents” and “agentic” stuff works by wrapping the core innovation (the LLM) in layers of simple code and other LLMs. Let’s try to imagine building a system that can handle a request like “find where I can buy a video card today. Make a table of the sites, the available cards, their prices, and how they compare on a benchmark.” We could solve this if we had some code like
```python
# llm, google_search, and fetch_url are assumed helper functions
search_prompt = llm(f"make a list of google web search terms that will help answer this user's question. present the result in a json list with one item per search. <request>{user_prompt}</request>")
results_index = []
for s in json.loads(search_prompt):
    results_index.extend(google_search(s))
results = [fetch_url(url) for url in results_index]
summarized_results = [llm(f"summarize this webpage, fetching info on card prices and benchmark comparisons <page>{r}</page>") for r in results]
return llm(f"answer the user's original prompt using the following context: <context>{summarized_results}</context> <request>{user_prompt}</request>")
```
It’s pretty simple code, and LLMs can write that, so we can even have our LLM write the code that will tell the system what to do! (I’ve omitted all the work to try to make things sane in terms of sandboxing and dealing with output from the various internal LLMs).
The important thing we’ve done here is instead of one LLM that gets too much context and stops working well, we’re making a bunch of discrete LLM calls where each one has a limited context. That’s the innovation of all the “agent” stuff. There’s an old Computer Science truism that any problem can be solved by adding another layer of indirection and this is yet another instance of that.
Trying to define a “limit” for this is not something I have a good grasp on. I guess I’d say that the limit here is the same: max tokens in the context. It’s just that we can use sub-tasks to help manage context, because everything that happens inside a sub-task doesn’t impact the calling context. To trivialize things: imagine that the max context is 1 paragraph. We could try to summarize my post by summarizing each paragraph into one sentence and then summarizing the paragraph made out of those sentences. It won’t be as good as if we could stick everything into the context, but it will be much better than if we tried to stick the whole post into a window that was too small and truncated it.
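The paragraph-summarization trick above can be sketched in a few lines of Python. Note this is a toy illustration: `MAX_CONTEXT` and the trivial `summarize()` are stand-ins for a real token limit and a real LLM call, not anything from the original post.

```python
MAX_CONTEXT = 80  # pretend the model can only see 80 characters at once

def summarize(text: str) -> str:
    # Stand-in for llm(f"summarize: {text}"); here we just keep the first sentence.
    return text.split(".")[0].strip() + "."

def summarize_document(paragraphs: list[str]) -> str:
    # Each paragraph is summarized in its own "sub-task" context...
    sentence_per_paragraph = [summarize(p) for p in paragraphs]
    # ...and only the short summaries flow back into the calling context.
    combined = " ".join(sentence_per_paragraph)
    if len(combined) > MAX_CONTEXT:
        # Still too big: recurse, treating the summaries as a new document.
        # (A real version would need a guard against summaries that don't shrink.)
        return summarize_document([combined])
    return summarize(combined)
```

The recursion is the whole point: no single call ever sees more than the context limit, at the cost of detail lost at each level (the “loaded gun” problem below).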
Some tasks will work impressively well with this framework: web pages tend to be a TON of tokens but maybe we’re looking for very limited info in that stack, so spawning a sub-LLM to find the needle and bring it back is extremely effective. OTOH tasks that actually need a ton of context (maybe writing a book/movie/play) will perform poorly because the sub-agent for chapter 1 may describe a loaded gun but not include it in its output summary for the next agent. (But maybe there are more ways of slicing up the task that would allow this to work.)
- Comment on Incel propaganda in my music app 2 weeks ago:
As with dating apps, if the mental health recommendations actually worked then engagement would go down. I doubt they even need a human putting their thumb on the scales; the algorithm optimizes engagement in whatever way it needs to.
- Comment on The developers of PEAK, explaining how they decided on pricing for their game. 4 weeks ago:
It works against the general population. If this particular one doesn’t work on you, don’t get too busy strutting; there is almost certainly something else that does.
That is very well put! I feel like I’ve talked to so many people who see one ad that doesn’t land and say, “ads don’t work on me.”
- Comment on Anon wants to talk about video games 1 month ago:
Hell yeah! When you click over from “this game is impossible wth” to kicking Rodney’s ass so much that you’d expect him to just give up…
Nice work cracking a tough game!
- Comment on Don't like the 'left liberal bias' of cited and sourced Wikipedia articles? Not a problem, our lord and savior Elon is introducing Grokipedia. 3 months ago:
Conservapedia was comically stupid. Like, there was a lengthy diatribe against the Theory of Relativity that seemed largely based on Andrew Schlafly confusing the physics term “relativity” with “moral relativism” and being against the latter. This was especially weird because Schlafly personally had a background in applied physics and so ought to know that GPS satellites serve as a working proof of parts of Relativity.
As embarrassing as basically every page of Conservapedia was, at least it represented some stupid man’s beliefs and effort. Grokipedia can’t even do that.
- Comment on Anon gets his life in order 3 months ago:
Yeesh. What’s the girlfriend getting out of all of this? Seems like a lot of work to run someone else’s life in addition to your own.
- Comment on You fool 4 months ago:
Though I’ve now had someone check Google and try to tell me that they’re right because of the AI summary agreeing with them.
- Comment on Posting for the "Now guys he was MURDERED! Don't celebrate!" Crowd 5 months ago:
The people cheering are just honoring Charlie’s legacy.
- Comment on Is it realistic to hope that lemmy grows to the size of the bigger social media platforms? 6 months ago:
This, but in a hopeful voice instead of sarcastic 🙂
(Being surrounded by people who think more progressively will tend to shift people’s views)
- Comment on Too bad we can't have good public transportation 7 months ago:
Capitalism is when there’s an owner-class controlling production via capital. It doesn’t really matter what they’re producing or at what cost or who’s consuming.
- Comment on Too bad we can't have good public transportation 7 months ago:
No, because cross-country trains and heavy use of them to move goods and people predates cars by quite a bit. Trains were a key component of the North winning the Civil War, for example.
Lots of existing train infrastructure needed to be torn out to make room for car infrastructure.
- Comment on Too bad we can't have good public transportation 7 months ago:
Are you suggesting that’s why the US hasn’t improved trains? Is there something about train improvements specifically that you think is harmful?
- Comment on You don’t see articles like this about moms with three two jobs who still manage to take care of their kids. 7 months ago:
Working hard, or jerking hard?
- Comment on hubris go brrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr 7 months ago:
Lots of things seem reasonable if you skip the context and critical reasoning. It’s good to keep some past examples of this that personally bother you in your back pocket. Then you have it as an antidote for examples that don’t bother you.
- Comment on Jack Dorsey says his 'secure' new Bitchat app has not been tested for security 8 months ago:
Looking at the code, it reads like it was written by an LLM: chatty commit messages, a lack of spelling/capitalization errors, bullet points galore, a shit-ton of “Fix X” commits that don’t read increasingly frustrated, and worthless comments randomly scattered, like “i + 1 // add 1 to i”, without any other comments on the page.
No security review because none of the code has been reviewed and he doesn’t know what’s in it.
- Comment on Kid gave a reasonable answer without all the math bullshit 9 months ago:
The title of this post is disappointing. The given answer is sound and it seems safe to assume it was arrived at by thinking mathematically.
- Comment on Rhubarb 9 months ago:
Some people think, “oh this witch leaving a note means she’s really powerless and I can keep taking the rhubarb.” It’s not going to be so awesome when she forecloses on his first-born.
- Comment on Bluesky Is Plotting a Total Takeover of the Social Internet 9 months ago:
Commercial software has advertising: people whose job is to advertise it. That means TV and web ads for Bluesky, influencers talking about it. It also means a team of software engineers building parts of the system specifically to draw people in, whereas non-commercial software often rejects that (lack of infinite-scroll on Lemmy’s default UI, for example).
ActivityPub also requires a different mind-set that doesn’t exist elsewhere on the internet today. You need to decide which instance to join, or maybe to host your own instance. But it doesn’t really matter, because you can federate with other instances. But you have to drive some of that federation, so it does matter a little. It’s pretty complex and confusing, and it’s a problem that only exists in this one niche of software.
Bluesky gives you an infinite feed that feels like you’re connected to the entire Internet without you doing any work. I think the AP services are doing really well, considering what they’re up against.
- Comment on People Are Losing Loved Ones to AI-Fueled Spiritual Fantasies 10 months ago:
Based on the article, it seems like cult-follower behavior. Not everyone is susceptible to cults (I think it’s a combo of individual brain and life-circumstances), but I wouldn’t say, “eh, it’s not the cult’s fault that these delusional people killed themselves!”
- Comment on DeepSeek-V3 now runs at 20 tokens per second on Mac Studio, and that’s a nightmare for OpenAI 11 months ago:
Not an answer to your question, but I thought this was a nice article for getting some basic grounding on the new AI stuff: arstechnica.com/…/a-jargon-free-explanation-of-ho…
- Comment on Anon gives a former president some feedback 11 months ago:
I’m holding out hope that we can still turn it around and defeat Santa Claus