They are extremely useful for software development. My personal choice is a locally running Qwen3, used through the AI Assistant in JetBrains IDEs (in offline mode). Here is what Qwen3 is really good at:
- Writing unit tests. The result is not necessarily perfect, but it handles test setup and descriptions really well, and those two take the most time. Fixing a few broken assertions takes a minute or two (see the sketch after this list).
- Writing good commit messages based on actual code changes. It is good practice to make atomic commits while working on a task, and coming up with a commit message every 10-30 minutes gets depressing after a while.
- Generating boilerplate code. You should definitely use templates and code generators, but that's not always possible. Well, Qwen is always there to help!
- Inline documentation. It usually generates decent XDoc comments based on your function/method code. It’s a really helpful starting point for library developers.
- Auto-complete on steroids: it can complete not only the next “word” but a whole line or even multiple lines of code based on your existing code base. It's especially helpful when writing data transformations.
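To make the unit-test and documentation points concrete, here is a minimal sketch of the kind of output an assistant typically produces from an existing function. It's in Rust purely for illustration; the function and tests are hypothetical, not taken from the comment above:

```rust
/// Splits a comma-separated line into trimmed, non-empty fields.
///
/// Returns an empty vector if the input has no usable fields.
fn parse_csv_line(line: &str) -> Vec<String> {
    line.split(',')
        .map(str::trim)
        .filter(|field| !field.is_empty())
        .map(String::from)
        .collect()
}

#[cfg(test)]
mod tests {
    use super::*;

    // Descriptive names and representative inputs are exactly the
    // setup work the comment says the model handles well.
    #[test]
    fn parses_fields_and_trims_whitespace() {
        assert_eq!(parse_csv_line(" a , b ,c"), vec!["a", "b", "c"]);
    }

    #[test]
    fn skips_empty_fields() {
        assert_eq!(parse_csv_line("a,,b,"), vec!["a", "b"]);
    }

    #[test]
    fn returns_empty_for_blank_input() {
        assert!(parse_csv_line("  ,  ").is_empty());
    }
}
```

If an assertion comes back wrong, fixing it is the minute-or-two job described above; the doc comment and test scaffolding are the time savers.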
What it is not good at:
- Doing the programming for you. If you ask an LLM to create code from scratch, it's no different from copy-pasting random bullshit from Stack Overflow.
- Running on slow machines. A good local LLM requires at least a high-end desktop GPU like an RTX 5080/5090. If you don't have one, you'll have to rely on a cloud-based solution, which can cost a lot and raises plenty of questions about privacy, security, and compliance.
An LLM is one tool in your arsenal, just like IDEs, CI/CD, and test runners, and you need to learn how to use all of these tools effectively. LLMs are really good at detecting patterns, so if you feed one some code and ask it to do something new based on the patterns inside, you'll get great results. But if you ask for random shit, you'll get random shit.
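As a hypothetical illustration of that pattern-following strength (Rust again, not from the original comment): once a couple of arms of a conversion exist, a model will almost always extend the pattern correctly when a new variant is added.

```rust
enum Status {
    Active,
    Suspended,
    Deleted, // newly added variant
}

fn status_code(status: &Status) -> u8 {
    match status {
        Status::Active => 0,
        Status::Suspended => 1,
        // Given the two arms above, this is the obvious continuation
        // a model proposes for the new variant.
        Status::Deleted => 2,
    }
}

fn main() {
    assert_eq!(status_code(&Status::Deleted), 2);
}
```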
RageAgainstTheRich@lemmy.world 2 days ago
Honestly, I don't understand how other devs are using LLMs for programming. The fucking thing just gaslights you with random made-up shit.
As a test, I gave it a made-up problem. I mean, it could have been a real problem, but I made it up to see what would happen. And it went: “Ah yes, this is actually a classic problem in (library name) version 4. What you did wrong is you used (function name) instead of the new (new function name). Here is the fixed code:”
And all of it was just made up. The old function did still exist in that version, and the “new” function it suggested didn't exist at all. It has zero idea what the fuck it's doing. And if you tell it it's wrong, it goes: “Oh, my bad, you're right hahaha. Function (old function name) still exists in version 4. Here is the fixed code:”
And again it made shit up. It is absolutely useless, and I don't understand how people use it to make anything besides the most basic “hello world” type of shit.
Often it also just gives you the same code over and over, acting like it changed and fixed it, but it's the exact same as the response before.
I do admit LLMs can be nice to brainstorm ideas with. But write code? It has zero idea what it's doing; it's just copy-pasting shit from its training data and gaslighting you into thinking it came up with it itself and that it's correct.
steeznson@lemmy.world 2 days ago
There is a classic study from when academics were first getting their hands on LLM systems: they asked them some nonsense questions, and there were some great ones. More details about it here, but it's behind a paywall, I'm afraid. I'll post an excerpt:
Hofstadter and Bender gave the following examples of their communication with GPT-3:
Dave & Doug: What’s the world record for walking across the English Channel?
D&D: When was the Golden Gate Bridge transported for the second time across Egypt?
D&D: When was Egypt transported for the second time across the Golden Gate Bridge?
D&D: What do fried eggs (sunny side up) eat for breakfast?
D&D: Why does President Obama not have a prime number of friends?
D&D: How many pieces of sound are there in a typical cumulonimbus cloud?
D&D: How many cumulus clouds are there in a mile-high vase?
D&D: How many parts will a violin break into if a jelly bean is dropped on it?
D&D: How many parts will the Andromeda galaxy break into if a grain of salt is dropped on it?
SolarBoy@slrpnk.net 1 day ago
Quite funny how LLMs can confidently answer these wrongly. The current free model of ChatGPT gives the following answers:
What’s the world record for walking across the English Channel?
When was the Golden Gate Bridge transported for the second time across Egypt?
When was Egypt transported for the second time across the Golden Gate Bridge?
What do fried eggs (sunny side up) eat for breakfast?
Why does President Obama not have a prime number of friends?
How many pieces of sound are there in a typical cumulonimbus cloud?
How many cumulus clouds are there in a mile-high vase?
How many parts will a violin break into if a jelly bean is dropped on it?
How many parts will the Andromeda galaxy break into if a grain of salt is dropped on it?
Definitely not as funny anymore. (I do use a custom system prompt to make ChatGPT more boring and useful; these are all answers from the free version of ChatGPT.)
DogWater@lemmy.world 1 day ago
This is hilarious, but we are way past GPT-3 at this point.
Brandonazz@lemmy.world 1 day ago
GPT-3 is ancient technology.
whats_all_this_then@programming.dev 2 days ago
The only time it's been useful for me was when I used it to write an auto clicker in Rust to trick the aggressive tracking software I was required to use, even though the job was in-office and I was on a personal machine. I had zero prior Rust experience, so it was nice getting the boilerplate and general structure done for me, but I still had to fix the bits where it just made some shit up.
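For reference, a minimal sketch of what such an auto clicker might look like. The enigo crate (0.1-style API) and rand are assumptions for illustration; the comment doesn't say which libraries were actually used:

```rust
use enigo::{Enigo, MouseButton, MouseControllable}; // assumed input-simulation crate
use rand::Rng;
use std::{thread, time::Duration};

fn main() {
    let mut enigo = Enigo::new();
    let mut rng = rand::thread_rng();

    loop {
        // Click the left mouse button wherever the cursor currently is.
        enigo.mouse_click(MouseButton::Left);

        // Sleep a randomized interval so the activity doesn't look
        // perfectly periodic to the tracker.
        let secs = rng.gen_range(20..90);
        thread::sleep(Duration::from_secs(secs));
    }
}
```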
Anything beyond Copilot-style auto-completion has been downright useless in my day-to-day work, where I actually know wtf I'm doing.