Another source: Intel is selling defective 13-14th Gen CPUs
[Gamers Nexus] Intel's CPUs Are Failing, ft. Wendell of Level1 Techs
Submitted 4 months ago by theangriestbird@beehaw.org to technology@beehaw.org
https://www.youtube.com/watch?v=oAE4NWoyMZk
Comments
theangriestbird@beehaw.org 4 months ago
Kissaki@beehaw.org 4 months ago
Over the last 3–4 months, we have observed that CPUs initially working well deteriorate over time, eventually failing. The failure rate we have observed from our own testing is nearly 100%, indicating it’s only a matter of time before affected CPUs fail.
damn
narc0tic_bird@lemm.ee 4 months ago
They “have to” push their current silicon beyond its limits just to keep up with AMD (especially X3D in gaming workloads).
They pushed too far, big time.
The only right thing to do here would be to offer a full refund of the original purchase price of the CPU and mainboard to all customers, stop selling affected models immediately and release revisions that aren’t unstable and rapidly degrading by default.
But this won’t happen of course.
averyminya@beehaw.org 4 months ago
Man I started getting nervous because I bought a bunch of parts to upgrade my partner’s PC. Couldn’t remember what Intel CPU I got cause I’m not as familiar with them.
12600KF, I’m safe phew.
On the topic, this is sad to hear because I’ve been waiting for the hat to drop on Intel’s turnaround. Between the move to stateside manufacturing and the development of some of the new tech that is available, I’ve felt like they’re somewhat well poised to start shifting away from their lackluster goals and performance stagnation.
The news of this muddles that feeling a bit for me. Issues like this, especially if they are known beforehand and shipped out anyway, speak to a wider issue in the company.
tal@lemmy.today 4 months ago
According to the video, the behavior the interviewed guy sees is that some CPUs are affected and some apparently aren’t. He had a very high rate, ~50% of CPUs affected, but the ones that weren’t just appeared to work normally. So even if you had an affected model, you might not be affected.
averyminya@beehaw.org 4 months ago
What a crazy defect with such a high prevalence. That’s really pretty crazy
dan@upvote.au 4 months ago
He had a very high rate, ~50% of CPUs in systems that he looked at were affected
Note that I think this was with the data center cohort, which runs its systems 24/7. The prevalence isn’t as high with regular consumer use (but still way too high). The data centers also didn’t have any problems at all with the 12900K.
floofloof@lemmy.ca 4 months ago
It sounds like these CPUs may be degenerating with use, so some can start out good and then turn bad after a few months. You’d never be sure whether you had a good one or just a bad one that hadn’t revealed itself yet.
onlinepersona@programming.dev 4 months ago
I’m glad this is hitting Intel and not AMD. The market needs more AMD marketshare. Hopefully RISC-V and ARM will become more mainstream in the next 5 years to dethrone Intel.
tal@lemmy.today 4 months ago
I had a 13900K fail after a few months; started seeing errors when anything more than one core was active. Got a 14900K, made sure to turn off the default motherboard settings that were recommended against by Intel before inserting chip. Failed in the same way after a few months.
Switched to an AMD motherboard and processor. Haven’t had any problems. I expect that I’ll continue using AMD processors moving forward unless they put some serious lemons out.
tal@lemmy.today 4 months ago
Also:
22:00:
Yeah, I’m wondering about what kinds of other nasty secondary fallout there will be. One reason that I didn’t want to spend time on this further – was willing to just eat the cost of the motherboard and a pair of CPUs and go AMD – was because I was developing root filesystem corruption just trying to boot with multiple cores, and I didn’t want to experiment on that further. It’s just not worth it to me as an individual dealing with a dicked-up filesystem to try to track down a piece of bad hardware. Like, there’s going to be unpleasant fallout out there with other people from data loss when a lot of CPUs are garbling data somehow.
averyminya@beehaw.org 4 months ago
AMD has been really solid. I’ve built a number of PCs and I’ve never run into an issue with the CPUs. The R5 2600, 3600, and R7 5800 and 5800X are all surprisingly efficient chips out of the box, but I played around with each and found even crazier undervolt settings. My server PC draws practically nothing except if there’s something using the (NVIDIA) GPU extensively (and even then it’s like, oh no, is it almost 75 watts? better call the fire brigade! lmao).
And obviously the R7 5800x is just a monster, although I’ve consistently seen that it runs hot but… I air cool mine and it’s never really going above 85c when under full load on stock, and if you play with undervolting at all it’s pretty easy to keep the exact same performance while lowering the total power delivered. Although I’ve found that it goes up to 85c still and the chip just runs faster…
tal@lemmy.today 4 months ago
I mean, I don’t hate Intel – I’ve used exclusively their systems for, I dunno, maybe 25 years. And as Linus says in the video, it’s not as if AMD has never had hardware problems on their CPUs. But this is a pretty insane dick-up on Intel’s part. Like, even if I’m generous and say “Intel had a testing regimen that these passed, because failures didn’t show up initially”, Intel should also have had CPUs that they kept running. They maybe didn’t know the cause. They maybe didn’t have a fix worked out or know whether they could fix it in software. But they should have known partway through the production run that they had a serious problem. And when they knew that there was a serious problem, my take is that they shouldn’t have been continuing to sell the things. I mean, I would not have picked up the second processor, the 14900KF, if I’d known that they knew that two processor generations were affected and they didn’t have a fix yet. Like, sure, companies make mistakes, can’t completely eliminate that, but they should have been able to handle the screw-up a whole lot better than they did.
I don’t think that this is cooling, and the video talks about that too. I initially suspected that cooling (or power) might somehow be a factor, given that one of the use cases that I could eventually get to reliably trigger problems for me was starting Stable Diffusion. But the video says no, they logged thermal data and the systems are running very conservatively. And I kept an eye on the temperatures the second time from the get-go.
It looks like the 5800x has a TDP of 105W.
I switched to a 7950X3D, which has a TDP of 120W, but on both the Intel processors and the AMD one, I was using one of these water coolers (which was definitely overkill on the AMD CPU). Never used water-cooling before this system – was never something that I’d consider necessary until the extreme TDPs that the recent Intel processors had – but it does definitely keep the processor cool.