PPC G5 Halts, Panics, Crashes - heat related?

Hardware, 32 posts, Oct 7, 2009 to Oct 15, 2009
Finally, it was a big mistake for Apple not to include ECC memory support in at least the "Pro" desktops. IBM has good engineers, but they come from a background that assumes that anything with a large amount of RAM uses ECC as an engineering best practice. Apple felt they "needed" to save the 15% or so ECC adds to the memory price tag so they special ordered a Northbridge not supporting it for the desktop machines. (Smart move, guys.) Intel *should* probably push PC makers into using ECC in their desktop machines, but since they don't they've gotten pretty good at making memory controllers reliable enough to do without it.
I'm not sure how relevant ECC is to desktop computers or even workstations. Current usage of such devices is primarily for "light computing" with the odd bit of audio and video processing. For those use cases, having an occasional bit flip (which we know is extraordinarily rare) is unimportant to the work that is being generated. Having RAM that causes the computer to crash and burn in the extreme failure case may provide a better indication to a normal user than incomprehensible system software logs.

For servers, ECC makes a lot of sense. If you're holding financial records or any sort of database in RAM, a bit flip can be expensive. And you're paying for admins or a software package to monitor your server, aren't you?

If you put ECC memory controllers in desktop PCs, what sort of quality control do you expect? I guess that Intel would do a good job overall (some of their ICH chipsets have problems, if you remember) and there'll be some horrible cheap offerings.

If you put ECC memory controllers in desktop PCs, what sort of quality control do you expect? I guess that Intel would do a good job overall (some of their ICH chipsets have problems, if you remember) and there'll be some horrible cheap offerings.
The whole point is that if you have ECC you can tolerate lousy quality control better than you can without it. ;^)

As mentioned here, estimates of the "average" bit error rate vary over an enormous range, i.e.:

Recent tests {[11],[4],[5]} give widely varying error rates with over 7 orders of magnitude difference, ranging from 10^-10 to 10^-17 error/bit·h, roughly one bit error, per hour, per gigabyte of memory to one bit error, per century, per gigabyte of memory.
Obviously it's not much of a problem if the real rate is near the once-a-century end, but if we assume it's closer to the once-an-hour end *and* assume that the error rate can be materially affected by the quality of the electrical design and manufacturing workmanship of the individual computer, then whether a system has ECC starts to matter a great deal. If a high-quality system has a bit randomly flip once a month then it's highly unlikely there will be any problems from it. Worst case it crashes or corrupts a file, but it's more likely the machine will be rebooted or the memory block overwritten before there's a problem. On the other hand, if a system with an "iffy" design that uses bleeding-edge or low-production hardware flips a bit once every few hours, there's a decent chance of getting hurt by it. That chance drops by an *enormous* factor if properly implemented ECC is in use. (Not only do you have to get two bit errors at once, they have to be in the same 64-bit word. In, say, 4 GB of RAM there's in theory about a one in 500 million chance that two consecutive bit errors land in the same word.)
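
For anyone who wants to check the arithmetic, here is a rough Python sketch. It is not a model of any real memory controller. The assumptions are mine: a "gigabyte" is 2**30 bytes, bit errors are independent and land uniformly across RAM, and only the 64 data bits of each ECC word are counted. The function names are made up for illustration.

```python
BITS_PER_BYTE = 8
GIB = 2**30  # assumption: "gigabyte" means 2**30 bytes


def errors_per_hour(rate_per_bit_hour: float, ram_bytes: int) -> float:
    """Expected bit errors per hour for a given per-bit hourly error rate."""
    return rate_per_bit_hour * ram_bytes * BITS_PER_BYTE


def same_word_odds(ram_bytes: int, word_bits: int = 64) -> float:
    """Return N such that two independent, uniformly placed bit errors
    land in the same word with probability 1/N."""
    total_bits = ram_bytes * BITS_PER_BYTE
    # The second error must hit one of the other (word_bits - 1) bits of the
    # word the first error landed in, out of all remaining bits in RAM.
    p = (word_bits - 1) / (total_bits - 1)
    return 1 / p


if __name__ == "__main__":
    print(f"{errors_per_hour(1e-10, GIB):.2f} errors/hour per GiB at 1e-10")
    print(f"1 in {same_word_odds(4 * GIB):,.0f} for 4 GiB")
```

Under those assumptions the 10^-10 rate comes out a little under one error per hour per gigabyte, and the same-word odds for 4 GB come out on the order of one in 500 million, in line with the figures above.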

The PowerMac G5s were supposedly "Pro" hardware. Even Dell's cheapest line of Precision workstation offers (and offered at the time) ECC support at least as an option. (The higher end models always include it.) Even if its real value is somewhat oversold it's still "cute" that Apple didn't even offer as an option what everyone else playing in the same ballpark considered standard. And of course the Mac Pro uses ECC... proof that Intel knows in its black little heart you need it for large memory applications, perhaps?

Anyway.

mp.ls