recently i’ve had a problem that keeps getting more frequent: my computer randomly freezes up/blackscreens and then fails to post when i do a hard restart. it doesn’t resolve itself until i open it up and play musical chairs with the ram for a bit.

shit that i have tried:

  1. swapped the ram around to different slots. sometimes it works, sometimes it doesn’t
  2. cleaned out the case
  3. wd40’d the ram pins (helped with the posting but seems to have increased crash frequency, not enough data to tell for sure)

no idea where to begin with this one, can’t tell if it’s a motherboard or a ram issue or something else entirely. the sticks are of differing sizes and manufacture so that may also be an issue. would give specs but the thing just died on me in the middle of posting this and i can’t boot in just yet. motherboard is a supermicro x9 something server board.

13 points

Don’t put WD-40 on the pins. I’d start by pulling out the sticks and cleaning the pins off with a q-tip and isopropyl alcohol. Probably a good idea to clean out the slots now too.

Get MemTest64 and run it with both sticks. If it fails, try it with each one by itself. If a stick doesn’t pass the test you should be able to get a new one under warranty. Just start an RMA request and say it failed MemTest64.

If it’s not your RAM then it’s probably a poorly seated CPU. Remove the cooler, clean the paste off, reseat the CPU, apply fresh paste, and carefully put the cooler back on without overtightening it or tightening one side more than the other.

“cleaning the pins off with a q-tip and iso alcohol”

i tried this at the beginning, things didn’t noticeably improve so i took it to a local shop and they gave me the wd40 treatment. will try again

“probably a poorly seated CPU”

please let this be it

It is wild to me that they put WD-40 on it. It’s a lubricant, not a solvent; it will leave residue behind. Regular WD-40 shouldn’t get anywhere near PC components, and the specific stuff they make for cleaning electrical contacts has a bunch of warnings and cautions that would keep me from using it on anything delicate or expensive.

5 points

WD40 isn’t a lubricant, it’s for “Water Displacement.” While as a liquid it can be used as one, it is a poor one. Its whole purpose is to cover a metal part with a hydrophobic layer. It’s good at removing water from something like your spark plugs. Maybe they thought water had gotten in and was causing issues with contact?

3 points

Seconding this. Get some 90% isopropyl and clean off all that WD-40. Let it fully dry/evaporate. The only thing you should spray on your computer parts is compressed air.

6 points

A few things to try out in addition to other folks’ good suggestions:

  • when it happens, after a hard shutdown, unplug the power cable, press the power button to discharge anything remaining, and then plug it back in and start. See if it consistently posts after you do this. This would indicate that a component is breaking itself but resets to a temporarily working state after a proper power cycle.

  • monitor temperatures. Log them to file if possible. Overheating components might explain why workarounds only work sometimes. Maybe some of them just let the components cool down enough.

  • just leave in one stick at a time and see how it goes. You can try to narrow down whether it’s a stick or a spot that’s broken by trying different slots with 1 stick and different sticks in the same spot.

  • Not posting can look like a few things. Is it possible it’s the video card / output breaking?
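For the temperature logging idea in the list above, here is a minimal Python sketch. It assumes a Linux box that exposes sensors under /sys/class/hwmon, which is common but not guaranteed on every board. The function names and the CSV layout are made up for illustration. Each row is flushed to disk right away so the last reading before a freeze should survive the hard reset.

```python
import csv
import os
import time
from datetime import datetime
from pathlib import Path


def read_temps(hwmon_root="/sys/class/hwmon"):
    """Return {sensor_label: degrees_C} for every readable temp*_input file."""
    temps = {}
    for chip in sorted(Path(hwmon_root).glob("hwmon*")):
        name_file = chip / "name"
        chip_name = name_file.read_text().strip() if name_file.exists() else "unknown"
        for sensor in sorted(chip.glob("temp*_input")):
            try:
                # hwmon reports temperatures in millidegrees Celsius
                millideg = int(sensor.read_text().strip())
            except (OSError, ValueError):
                continue  # some sensors fail or return junk; skip them
            label = f"{chip.name}:{chip_name}/{sensor.name[:-len('_input')]}"
            temps[label] = millideg / 1000.0
    return temps


def log_temps(path, interval_s=5.0, samples=None, hwmon_root="/sys/class/hwmon"):
    """Append timestamped readings to a CSV file.

    samples=None means run until interrupted (Ctrl-C)."""
    with open(path, "a", newline="") as f:
        writer = csv.writer(f)
        count = 0
        while samples is None or count < samples:
            now = datetime.now().isoformat(timespec="seconds")
            for label, deg in read_temps(hwmon_root).items():
                writer.writerow([now, label, deg])
            # push the rows to disk so they survive a sudden freeze
            f.flush()
            os.fsync(f.fileno())
            count += 1
            if samples is None or count < samples:
                time.sleep(interval_s)
```

Calling `log_temps("temps.csv")` would log every 5 seconds until interrupted. After a crash, the last few rows of the file show how hot things were right before it died.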

  1. i’ve been doing this when testing each individual stick of ram, there is no real pattern, but some stick/slot combinations are more consistent than others.

  2. will try this when i get the thing to turn on.

  3. see 1

  4. how would i test/fix this? nvidia-smi was fine last i checked. would this have any correlation with the ram issues?

3 points

If you’ve tested each stick all by itself (no others plugged in) in a few different slots and all of them have this issue, that suggests that it’s not the sticks and possibly not the slots either. If it were one of those two, you’d expect to find at least one stable single stick + slot combination, since usually only one thing breaks at a time: one stick, or one slot (or a single pair of slots).

For your graphics card, do you also have an integrated one in the CPU? If so, I’d remove your discrete card and see if it’s more stable. You’d need to switch your monitor cable to a different receptacle, of course. If that’s not an option, I’d come up with ways to “ping” your computer under the assumption that maybe it is posting and working but just not showing you anything. You could set up an SSH server or similar and auto-login, then see whether you can still get in after one of these incidents and a hard reset.
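One rough way to do that “ping” from a second machine is to check whether the SSH port accepts connections. This is only a sketch: it assumes an SSH server is already running on the usual port 22, and the function name is made up. A successful TCP connect only shows the OS came up far enough to start sshd, not that everything is healthy.

```python
import socket


def is_port_open(host, port=22, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # covers refused connections, timeouts, and unreachable hosts
        return False
```

After a black screen and hard reset, running `is_port_open("192.168.1.50")` from another machine (the address is a placeholder for the box’s real one) and getting `True` would point toward a video output problem rather than a failed post.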

The inconsistency of the memory issue makes me think it isn’t memory (no single stick at a time is stable in any slot, right?). I’d start removing more components to see if any minimal set is stable.

4 points

try running with a single stick maybe? if it crashes, try a different one. if that still crashes, it’s most likely not ram.

i was on 6 sticks, i think i have narrowed the candidates down to 3 stable sticks, 2 unstable, and 1 definitely busted

problem is the stable sticks only work in certain slots and even then uptime is not great

one of the unstable sticks is brand new, makes me think that it got destroyed by being in one of the bad slots

a big problem is that i have 16 slots for ram and it’s a total pain in the ass to test all of them
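With 6 sticks and 16 slots, writing down every single-stick test and tallying failures per stick and per slot can make a pattern easier to spot than memory alone. A minimal Python sketch, where the function name and the stick/slot labels are purely hypothetical:

```python
from collections import defaultdict


def failure_rates(trials):
    """Tally single-stick test results.

    trials: iterable of (stick, slot, passed) tuples.
    Returns (by_stick, by_slot), each mapping a label to (failures, total)."""
    by_stick = defaultdict(lambda: [0, 0])
    by_slot = defaultdict(lambda: [0, 0])
    for stick, slot, passed in trials:
        for table, key in ((by_stick, stick), (by_slot, slot)):
            table[key][1] += 1          # one more trial for this stick/slot
            if not passed:
                table[key][0] += 1      # and one more failure
    return (
        {k: tuple(v) for k, v in by_stick.items()},
        {k: tuple(v) for k, v in by_slot.items()},
    )


# Hypothetical log: labels are whatever you write on the sticks and board.
trials = [
    ("stick1", "slotA", True),
    ("stick1", "slotB", False),
    ("stick2", "slotA", True),
    ("stick2", "slotB", False),
]
by_stick, by_slot = failure_rates(trials)
```

In this made-up log both sticks fail half the time, but every failure lands in `slotB`, which would point at the slot rather than the sticks. If failures spread evenly over every stick and slot, that would fit the suspicion that something else on the board is at fault.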

3 points

if sticks of ram are only working in certain slots it’s entirely possible the IC that controls the ram is shot.

Recently had this happen on an old dual xeon setup, rendered half of my 192GB of ram unusable and was causing problems exactly like what you’re describing.

Does the mobo show the sticks as inserted upon bootup?
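Besides watching the boot screen, once the machine is up on Linux you can ask the firmware what it thinks is installed with `dmidecode --type memory` (needs root). Below is a rough Python sketch that parses that output. The helper names are made up, and the parsing assumes the usual “Memory Device” blocks with “Size:” and “Locator:” fields, which may differ on some boards or dmidecode versions.

```python
import subprocess


def run_dmidecode():
    """Return the text of `dmidecode --type memory` (must be run as root)."""
    result = subprocess.run(
        ["dmidecode", "--type", "memory"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def parse_memory_devices(dmidecode_text):
    """Return one {"locator": ..., "size": ...} dict per Memory Device block."""
    devices = []
    current = None
    for line in dmidecode_text.splitlines():
        stripped = line.strip()
        if stripped == "Memory Device":
            current = {}
            devices.append(current)
        elif not stripped:
            current = None  # a blank line ends the current block
        elif current is not None and ":" in stripped:
            key, _, value = stripped.partition(":")
            if key == "Locator":
                current["locator"] = value.strip()
            elif key == "Size":
                current["size"] = value.strip()
    return devices


def empty_slots(devices):
    """Locators of slots the firmware reports as having no module."""
    return [d.get("locator") for d in devices
            if d.get("size") == "No Module Installed"]
```

If a slot you know has a stick in it shows up in `empty_slots(parse_memory_devices(run_dmidecode()))`, the board is not detecting that stick, which would fit the bad-controller theory.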

Is it overheating maybe?

My cooler failed not long ago, the symptoms there were similar. Computer would freeze/crash and then wouldn’t turn on until it cooled off.
