Hawk
@Hawk@lemmynsfw.com
- Comment on How does this pic show that Elon Musk doesnt know SQL? 6 days ago:
I’ve had a poor experience with btrfs dedupe tbh (and a terrible experience with qgroups); however, this was years ago. I do prefer Btrfs snapshots though, it’s much easier not to have that dependence.
What distro are you using for ZFS, void?
- Comment on How does this pic show that Elon Musk doesnt know SQL? 6 days ago:
Fair point, I’ve edited the answer to be clearer for future readers.
- Comment on How does this pic show that Elon Musk doesnt know SQL? 6 days ago:
Well, I’ve had a great time using LLMs to sandbox a dozen implementations and then investigate the shortcomings and advantages of each.
Mistakes happen a lot but they can be managed on a small MWE with a couple of tests.
It’s how the tool is used more than any given tool being bad.
I understand your point and you’re not wrong. However, I’m not wrong either and you should take a second look at how you might use these tools in a way that makes your life easier and addresses the valid limitations you’ve described.
- Comment on How does this pic show that Elon Musk doesnt know SQL? 1 week ago:
I disagree, it’s just a tool. It’s a fantastic way to template applications very quickly, particularly for those who are not already familiar with technologies and may not have the time or opportunity to play around with things otherwise.
An LLM is not a search engine and it can produce awful code. This is not production code, it’s for tinkering. As a sandbox tool, LLMs are fantastic.
On the ethical side of things, yeah, OpenAI sucks. Qwen2.5 would be up to this task, and one can run that locally.
- Comment on How does this pic show that Elon Musk doesnt know SQL? 1 week ago:
It’s because the comments he made are inconsistent with common conventions in data engineering.
- It is very common not to deduplicate data and instead just append rows. The current value is the most recent and all the old ones are simply historical. That way you don’t risk losing data and you have an entire history.
- Whilst you could do some trickery to deduplicate the data, it does create more complexity. There’s an old saying with ZFS: “Friends don’t let friends dedupe”, and it’s much the same here.
- Compression is usually good enough. It will catch duplicated data and deal with it in a fairly efficient way; not as efficient as deduplication, but it’s probably fine and it’s definitely a lot simpler.
- Claiming the government does not use SQL
- It’s possible they have rolled their own solution or they are using MongoDB or something, but this would be unlikely and wouldn’t really refute the initial claim.
- I believe many other commenters noted that it probably is MySQL anyway.
Basically what he said is incoherent to anybody who has worked with larger data.
In terms of using SQL, it’s basically just a more reliable and better Excel that doesn’t come with a default GUI.
If you need to store data, it’s almost always best to throw it into a SQLite database because it keeps it structured. It’s standardised and it can be used from any programming language.
However, many people use Excel because they don’t have experience with programming languages.
Get ChatGPT to help you write a PyQt GUI for a SQLite database and I think you would develop a high-level understanding of how the pieces fit together.
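To make the append-only idea concrete, here is a minimal sketch using Python’s built-in sqlite3 module. The table and column names (person_address, person_id, address) and the helper functions are made up for illustration; they are not taken from any real system.

```python
import sqlite3

# Hypothetical schema for illustration: one row per recorded value,
# never updated in place.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE person_address ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " person_id INTEGER NOT NULL,"
    " address TEXT NOT NULL)"
)


def record_address(person_id, address):
    """Append a new row instead of overwriting the old value."""
    conn.execute(
        "INSERT INTO person_address (person_id, address) VALUES (?, ?)",
        (person_id, address),
    )


def current_address(person_id):
    """The current value is simply the most recently appended row."""
    row = conn.execute(
        "SELECT address FROM person_address"
        " WHERE person_id = ? ORDER BY id DESC LIMIT 1",
        (person_id,),
    ).fetchone()
    return row[0] if row else None


def address_history(person_id):
    """Every row ever recorded for the person, oldest first."""
    rows = conn.execute(
        "SELECT address FROM person_address WHERE person_id = ? ORDER BY id",
        (person_id,),
    ).fetchall()
    return [r[0] for r in rows]


record_address(1, "1 Old Street")
record_address(1, "2 New Road")
record_address(1, "2 New Road")  # a repeated row is harmless in this design
```

Here current_address(1) reads the newest row and address_history(1) returns all three rows, duplicate included. Nothing is ever deleted, which is the whole point.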
- Comment on AI Traning 3 weeks ago:
An LLM is an equation, fundamentally. Map a word to a number, equation, map back to words and now llm. If you’re curious write a name generator using torch with an rnn (plenty of tutorials online) and you’ll have a good idea.
The parameters of the equation are referred to as weights. They release the weights but may not have released:
- source code for training
- source code for inference / validation
- training data
- cleaning scripts
- logs, git history, development notes etc.
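A toy sketch of the “word to number, equation, back to words” idea, in plain Python. The three-word vocabulary and the hand-written weights are made up for illustration. A real LLM learns billions of weights and usually samples from the probabilities rather than always taking the top one, so treat this as a cartoon, not an implementation.

```python
import math

# Toy illustration only: a made-up vocabulary and hand-picked weights.
vocab = ["the", "cat", "sat"]
word_to_id = {w: i for i, w in enumerate(vocab)}

# The "weights": row i holds a score for each possible next word after word i.
weights = [
    [0.0, 2.0, 0.0],  # after "the", favour "cat"
    [0.0, 0.0, 2.0],  # after "cat", favour "sat"
    [2.0, 0.0, 0.0],  # after "sat", favour "the"
]


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def next_word(word):
    """Map word -> number, apply the equation, map the result back to a word."""
    probs = softmax(weights[word_to_id[word]])
    best = max(range(len(probs)), key=probs.__getitem__)
    return vocab[best]


def generate(start, n):
    """Feed each predicted word back in as the next input, n times."""
    words = [start]
    for _ in range(n):
        words.append(next_word(words[-1]))
    return " ".join(words)
```

Releasing “the weights” in this cartoon would mean publishing the weights table and nothing else: not how the numbers were chosen, and not the data they came from.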
Open source is typically more concerned with the open nature of the code base to foster community engagement and less on the price of the resulting software.
Curiously, open-weight LLM development has somewhat flipped this on its head: the resulting software is freely accessible and distributed, but the source code and materials are less accessible.
- Comment on AI Traning 3 weeks ago:
The energy use isn’t that extreme. A forward pass on a 7B model can be achieved on a MacBook.
If it’s code and you RAG over some docs you could probably get away with a 4B tbh.
ML models use more energy than a simple model, but not that much more.
The reason large companies are using so much energy is that they are using absolutely massive models to do everything so they can market a product. If individuals used the right model to solve the right problem (size, training, feeding it with context etc.) there would be no real issue.
It’s important we don’t conflate the excellent progress we’ve made with transformers over the last decade with an unregulated market, bad company practices and limited consumer tech literacy.
TL;DR: LLM != search engine
- Comment on [deleted] 4 weeks ago:
I’ve used this app before and I really like it and I’d recommend it to everybody in this thread:
- Comment on [deleted] 4 weeks ago:
Track your macros, get some WPC and make peanut butter+banana+WPC milkshakes.
Savoury dishes will be limited to noodles – from experience.
- Comment on If AI spits out stuff it's been trained on 4 weeks ago:
It doesn’t need CSAM in the dataset to generate images that would be considered CSAM.
I’m sure they take good effort to stay away from that stuff as it’s bad for business.
- Comment on Hope you had a great christmas 1 month ago:
Probably exports. Small population + large number of exports, it’s possible the per capita scaling skews this.
- Comment on How am I supposed to obtain income? 2 months ago:
Wife and I have been unemployed for nearly a year. We’re in a white collar recession so it’s gonna be brutal for a little while. Not much you can do really, it’s really hard right now.
Labouring / trades seem like the ticket tbh.
- Comment on the lifestyle 3 months ago:
It’s a lot more like Seaborn. It produces gorgeous plots with a lovely syntax that is quick and easy to use, but it’s not a full drawing toolkit like matplotlib.
If I need the plot to have a very precise aesthetic, mpl is great. But if I want a high quality statistical plot that looks great, ggplot2 will do it in about 2 seconds.
I have no idea how OP thinks they could make a decent histogram any quicker than
ggplot(data, aes(x = x)) + geom_histogram()
I mean, you don’t even have to leave your shell/editor or extract the SQL into CSV.
- Comment on What can I do with US$10K that is a good investment? 4 months ago:
This. Alternatively, 401K depending on a variety of factors.
- Comment on How do I Graphene OS? 5 months ago:
To elaborate on this answer, I found the performance with graphene to be really subpar.
I know it’s supposed to be exactly the same in terms of speed, but that was not my experience.
If you find the performance of your device to be inadequate now, you may wish to upgrade first.
- Comment on Every show with a suicide now has a disclaimer with a suicide hotline at the beginning. Is there any evidence that these warnings make a positive difference? 5 months ago:
I would like to think that these hotlines are helpful.
I have experience with somebody calling a sexual abuse hotline and being told to “Work less and go outside tomorrow”.
This was a crisis situation and the advice was woefully inadequate and unhelpful.
Overall, I’m sure access to a hotline that is monitored with people who are experts at dealing with these situations is a good thing. I doubt they’re funded very well though.
- Comment on Is linux actually gaming ready or is it just not for me? 5 months ago:
If a game doesn’t run on Linux, I just don’t play it.
Life is too short, I don’t care anymore.
I need Linux for work and I have no interest in paying for an OS that doesn’t let me have privacy.
So fuck it, if companies don’t write their software well enough… I’ll live.
I’d rather spend time in a bar anyway.
- Comment on i will never understand scientific fraud 5 months ago:
Yeah, if you’re foolish enough to go into research, you still have to pay rent.
- Comment on Tensors 5 months ago:
Well, it would still be a vector. So some standardisation.
- Comment on Why I Haven't Seen Any Trump Supporters In Fediverse (Lemmy and Mastodon)? 5 months ago:
I point your attention to wolfballs.
I may not agree with most of the perspective, but the author’s opposition to censorship is admirable.
Yeah free speech isn’t always free, but I’d rather the freedom to read things I disagree with. Others may disagree though.
- Comment on Star Wars Outlaws - Review Thread 5 months ago:
I really enjoyed The Bad Batch, The Clone Wars, Rogue One and Andor.
- Comment on What is the actual point of a bra? 7 months ago:
My favourite part of threaded platforms is the arbitrary and tangential discourse
- Comment on How does the xz incident impacts the average user ? #xz 10 months ago:
What about a VPN behind WireGuard/OpenVPN?
I would presume no?
- Comment on Expertise 11 months ago:
Australia, New Zealand, Europe, Asia.
I’ve never heard of Masters for PhD? Coursework is opposite direction?
- Comment on Please Stop 11 months ago:
Or maybe I need product X to get by day to day but I can’t afford a health insurance plan.
It’s really not as simple as most people make it out to be.
- Comment on Please Stop 11 months ago:
Well I use Bitcoin everyday and I’m grateful for it.
Banks don’t support the transactions I need to make.
- Comment on Chad scraper 1 year ago:
Imagine an investment firm looking at a property market. They need data like price trends in the surrounding area.
A real estate API is expensive, scraping is free. By hiring an employee to scrape instead, they can save money.