Comment on The right FUCKING time to get TWO ram sticks damaged
Wispy2891@lemmy.world 4 days ago
No, even tried to run them at 1866…
tal@lemmy.today 4 days ago
Ah, fair enough. Long shot, but thought I’d at least mention it on the off chance that maybe it would work and maybe you hadn’t yet tried it. Sorry.
tries to think of anything else that could be done
Are you using Linux? Linux has a patch that was added many years back with the ability to map around damaged regions in memory. I mean, if your memory is completely hosed and you can’t even boot the kernel, then that won’t work, but if you can identify specific areas that fail, you can hand that off to the kernel and it can just avoid them. Obviously decreases usable memory by a certain amount, but…shrugs
I’ve never needed to do it myself, but let me go see if I can find some information. Think it was the “badram” feature.
searches
Okay. You’re running memtest86. It looks like that has the ability to generate the string you need, and you hand that off to GRUB, which hands it off to the kernel.
memtest86.com/blacklist-ram-badram-badmemorylist.…
If you can’t even boot the system sufficiently to get update-grub to run, then you might need to do a fancier dance, but that’s probably a good first thing to try.
Wispy2891@lemmy.world 4 days ago
wow i’m running linux, so it might be perfect
though i’m a bit scared that it will get worse over time. Today i got a freeze that forced me to test the ram with memtest86, but since september i got some random corruption in the btrfs filesystem (luckily always “useless” files like flatpak or docker stuff that i could delete and download again in seconds) and i assumed it was a btrfs bug, not hardware problem
COASTER1921@lemmy.ml 4 days ago
If I were in this position I’d strongly consider using 16GB for the next year or two if you have a working stick. Especially with an NVME SSD, good swap performance makes the impact of running out of memory much smaller than it used to be.
justlemmyin@lemmy.world 4 days ago
I had to do this on my busted ddr4 2 weeks ago. Badram didn’t work, but memmap did. I had to do bit flipping to get the translation from BADRAM as explained here.
I think the latest memtest86+ has the option to report in memmap format. But you will need to take a photo of the screen, coz it’s FOSS and not as fancy as PassMark’s memtest86.
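For anyone who wants to see what that translation amounts to, here is a rough sketch in Python. It assumes each BADRAM entry is an address,mask pair where the mask is 64 bits wide with contiguous leading ones (so each pair describes one aligned block), and it prints the region in the kernel's `memmap=size$start` form. The function names and the sample addresses are made up for illustration, and depending on how you pass the parameter through GRUB the `$` may need escaping.

```python
FULL_64 = (1 << 64) - 1  # assumption: masks are 64 bits wide


def badram_to_memmap(addr: int, mask: int) -> str:
    """Turn one BADRAM addr,mask pair into a memmap=size$start string.

    Only handles masks whose set bits are contiguous from the top,
    i.e. pairs that describe a single aligned block of memory.
    """
    low_bits = ~mask & FULL_64            # the "flipped" mask
    if mask == 0 or low_bits & (low_bits + 1):
        raise ValueError(f"mask {mask:#x} is not a single aligned block")
    start = addr & mask                   # first bad byte
    size = low_bits + 1                   # length of the block
    return f"memmap={size:#x}${start:#x}"


def convert(badram: str) -> list[str]:
    """Convert a comma-separated 'addr,mask,addr,mask,...' string."""
    values = [int(v, 16) for v in badram.split(",")]
    if len(values) % 2:
        raise ValueError("expected an even number of values")
    return [badram_to_memmap(a, m) for a, m in zip(values[::2], values[1::2])]


if __name__ == "__main__":
    # Hypothetical values, not taken from anyone's real memtest output.
    for entry in convert("0x7a3c0000,0xfffffffffffff000"):
        print(entry)
```

A mask that does not reduce to one aligned block (for example one with the upper 32 bits clear) is rejected rather than guessed at, since that case would need to be split into several memmap entries by hand.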
chellomere@lemmy.world 4 days ago
You can even make linux run an automatic memtest on boot and reserve the bad areas it finds. This is with the memtest=N kernel parameter, where N is the number of passes. memtest=17 tests all patterns. With this, the kernel will run an automatic test on every boot.
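If you add the parameter and want to confirm it actually reached the kernel, one way is to look at the command line the running kernel booted with. A minimal sketch, assuming a Linux system where `/proc/cmdline` is readable; the helper name is hypothetical and it only handles the `memtest=N` form described above:

```python
from typing import Optional


def memtest_passes(cmdline: str) -> Optional[int]:
    """Return N from a 'memtest=N' token in a kernel command line, or None."""
    result = None
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key == "memtest" and sep:
            result = int(value, 0)  # base 0 accepts decimal or 0x-prefixed hex
    return result
```

On a running system you could call it as `memtest_passes(open("/proc/cmdline").read())`; a result of `None` means the parameter was not passed on this boot.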
SlurpingPus@lemmy.world 4 days ago
To add to what the above commenter said: afaik Grub allows specifying kernel parameters at boot by pressing some hotkey. You could type in the string from memtest86 if you find what the parameter should be called (or add the memtest parameter instead).
MigratingApe@lemmy.dbzer0.com 4 days ago
Let’s say that you would be surprised if we actually started checking this. I will not disclose my occupation, but there are thousands of pieces of critical telco infrastructure equipment that not only run non-ECC RAM because of cost cutting, but run with actually broken DRAM modules, regularly rebooting at least a few times a day and causing local outages…
Back to the topic at hand - doesn’t it seem strange that only CPU4 finds issues in memtest86? It could be a CPU or even motherboard that got damaged and not the DRAM itself, no?
tal@lemmy.today 4 days ago
I noticed that, but OP said that he ran the thing in three different systems, so I’m assuming that he’s seen the same problems with multiple CPUs. It may be (I don’t know) that memtest86, at least as he’s running it, doesn’t necessarily try to hit each byte of memory with each CPU, or that the order in which it tests means the errors only show up under one CPU.
I also wondered if it might be a 13th or 14th gen Intel CPU, the ones that destroyed themselves over time. But (a) it’s a mobile CPU, and only the desktop CPUs had the problem there, and (b) it’s 11th gen.