Many of us, particularly if we have been programmers, have got into the habit of
regarding computers as flawless execution engines. People with more of an
electronics background tend to be a bit more sceptical, I think.
For several months now, I’ve been trying to figure out why I couldn’t burn a
Fedora 11 DVD to upgrade one of my oldest machines. I had checked the SHA-256
hash of the download, then copied the file from the server where I run BitTorrent
across to a desktop machine’s external hard drive. The burned disk verified
against the image on the machine that created it but the installation self-test
always failed, claiming the disk was corrupt. I tried burning from the same
image on another machine; I tried burning at different speeds; I tried different
blank DVDs. No change.
Finally, today, I thought to try verifying the hash on the copied image rather
than the original one. It was different. Comparing the original download with
the copy, I discovered two locations in the copy where byte 0x12 of a block had
dropped the 0x08 bit.
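For anyone wanting to run the same kind of check, here is a minimal sketch in Python of the two steps: hashing each file, then listing the bytes that differ. It is not the tooling I actually used; the function names `sha256_of` and `differing_bytes` and the 1 MiB chunk size are assumptions made for illustration.

```python
import hashlib

CHUNK = 1 << 20  # read 1 MiB at a time (arbitrary choice for this sketch)


def sha256_of(path):
    """Return the SHA-256 hex digest of the file at `path`."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(CHUNK), b""):
            h.update(chunk)
    return h.hexdigest()


def differing_bytes(path_a, path_b):
    """Yield (offset, xor_mask) for every byte position where the files differ.

    The XOR mask has a 1 in each bit position that changed, so a single
    dropped 0x08 bit shows up as a mask of 0x08. Only the common length of
    the two files is compared; a difference in size is not reported.
    """
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(CHUNK)
            b = fb.read(CHUNK)
            if not a and not b:
                break
            if a != b:
                for i, (x, y) in enumerate(zip(a, b)):
                    if x != y:
                        yield offset + i, x ^ y
            offset += max(len(a), len(b))
```

Calling `sha256_of` on both the original and the copy shows whether they match at all. If they don't, iterating over `differing_bytes(original, copy)` and printing each offset and mask in hexadecimal makes a single flipped bit easy to spot.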
It’s probably not a coincidence that the machine on which I made the corrupted
copy has recently come back from a couple of extended “warranty repair” holidays
during which first the main system logic board and then (at my strong and
repeated insistence) the actual DRAM were replaced. The machine had been having
some intermittent problems involving applications shutting down unexpectedly;
these looked like memory issues to me but the manufacturer’s diagnostics had
always given it a clean bill of health. As an old-school computer guy, of
course, I know that the manufacturer’s diagnostics never detect real memory
issues.
The moral of the story? I’m not sure there is one: “faulty hardware sometimes
gives the wrong answer” seems rather an obvious thing to say. On the other
hand, if you are aware of the concept of metastability in electronics, you know
that there’s no such thing as perfect hardware as long
as the logic needs to talk to the outside world. So we can reduce the frequency
of such weirdness to the point where we never expect to encounter it, but we can
never make it go away altogether.