Every layer of computing has a failure story. Every layer has an engineering response. That pattern is the point.
On September 9, 1947, engineers at Harvard found a moth stuck between the relay contacts of the Mark II computer. They taped it into the logbook and wrote: "First actual case of bug being found."
Grace Hopper and her team helped popularize the terms "bug" and "debugging," though Edison had used "bug" for technical problems as early as 1878. What matters is not who coined the term but what the incident represents: when a physical machine encounters a physical problem, you open it up, find the cause, document it, and fix it.
The Mark II had banks of electromechanical relays. Relays are switches. A moth bridging the contacts of a relay is a short circuit. The fix was simple: remove the moth. The principle was profound: errors happen, and the practice of systematically finding and documenting them is as important as the engineering itself.
In 1994, Professor Thomas Nicely at Lynchburg College noticed that his Pentium processor was giving slightly wrong answers for certain floating-point division operations. The error could appear as early as the fourth significant digit.
The cause: Intel's Pentium used an SRT division algorithm with a lookup table of 2,048 cells, of which 1,066 needed to contain specific values. Five cells that should have contained +2 instead contained zero. When the division algorithm accessed those five cells, it produced incorrect results.
The probability of hitting the bug with random inputs was approximately 1 in 9 billion. But for specific inputs, the error was reproducible and significant. The classic test: divide 4,195,835 by 3,145,727. The correct answer rounds to 1.3338. The Pentium returned 1.3337.
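The classic test is easy to reproduce on a correct FPU. The sketch below also includes the widely cited self-test expression, which should be essentially zero on correct hardware but famously came out as 256 on a flawed Pentium:

```python
x, y = 4_195_835.0, 3_145_727.0

q = x / y
print(f"{q:.5f}")   # 1.33382 on a correct FPU; a flawed Pentium gave ~1.33374

# Widely cited self-test: essentially zero (up to rounding) on a correct
# FPU; a flawed Pentium returned 256.
residue = x - (x / y) * y
print(residue)
```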
In 2003, an electronic voting machine in Schaerbeek, Belgium, gave one candidate exactly 4,096 extra votes. That number, 2 to the 12th power, was the clue. A single bit had flipped in the machine's memory, adding 4,096 to a vote count.
The error was caught only because the total votes exceeded the number of eligible voters. A recount confirmed: one bit, position 12, had changed from 0 to 1. All other counts were unchanged.
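The arithmetic of the anomaly is simple to demonstrate: flipping bit 12 of a counter from 0 to 1 adds exactly 2 to the 12th power. A minimal sketch (the voting machine's actual code is not public; the tally value here is illustrative):

```python
votes = 514                       # a legitimate tally (illustrative value)
corrupted = votes ^ (1 << 12)     # a "cosmic ray" flips bit 12 from 0 to 1

print(corrupted - votes)          # 4096
```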
The probable cause: a cosmic ray. High-energy particles from space collide with atoms in the atmosphere, producing secondary particles that can strike computer memory and flip individual bits. This is called a "soft error" or "single-event upset."
In 2013, a speedrunner playing Super Mario 64 executed a jump in the Tick Tock Clock level, and Mario suddenly rocketed through the floor. This was not a known glitch. It was not reproducible.
Analysis by the speedrunning community traced the anomaly to a single flipped bit in the 32-bit float storing Mario's height. The stored value 0xC5837800 became 0xC4837800: the leading byte changed from 1100 0101 to 1100 0100, instantly raising Mario's vertical position by more than 3,000 in-game units. One bit, most likely flipped by a cosmic ray, teleported a video game character through a virtual floor.
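The effect of such a flip is visible in IEEE 754 arithmetic. A sketch using the values commonly cited in the community analysis, assuming a single-precision float:

```python
import struct

def float_from_bits(bits: int) -> float:
    """Interpret a 32-bit pattern as an IEEE 754 single-precision float."""
    return struct.unpack(">f", bits.to_bytes(4, "big"))[0]

before = 0xC5837800            # Mario's height as stored in memory
after = before ^ (1 << 24)     # flip one bit in the exponent byte: C5 -> C4

print(float_from_bits(before))   # -4207.0
print(float_from_bits(after))    # -1051.75
```

Flipping a bit in a float's exponent does not nudge the value; it scales it by a power of two, which is why one particle strike could move Mario thousands of units at once.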
Such flips are more common than you might think. IBM estimated in 1996 that a desktop with 256 MB of RAM would experience approximately one cosmic ray bit flip per month. Computers at airplane cruising altitude experience bit flips 300 times more frequently. Modern systems with 16 GB of RAM present proportionally more bits for a stray particle to hit.
Server hardware uses ECC (Error-Correcting Code) memory, which detects and corrects single-bit errors automatically. Consumer hardware (your laptop, your phone, your Nintendo 64) generally does not.
This is a byte of memory storing the number 77. Click "Cosmic Ray" to flip a random bit.
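The same demo can be sketched in a few lines, assuming an 8-bit byte and a uniformly random bit:

```python
import random

byte = 77                      # 0b01001101
bit = random.randrange(8)      # the "cosmic ray" picks a random bit position
flipped = byte ^ (1 << bit)    # flip that single bit

print(f"{byte:08b} -> {flipped:08b} (bit {bit} flipped, value now {flipped})")
```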
Moths. Missing lookup values. Cosmic rays. At every layer of the computing stack, something can go wrong. Something does go wrong. The response is always the same: detect the error, handle it, build systems to prevent it or work around it.
Parity bits. Checksums. ECC memory. Error-correcting codes. Redundant systems. NASA runs computations in triplicate and uses majority voting to override flipped bits.
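The majority-voting idea can be sketched in a few lines. This toy version (a simplification of real triple-modular-redundancy hardware, which votes on outputs in real time) votes bitwise across three redundant copies of a word:

```python
def majority(a: int, b: int, c: int) -> int:
    """Bitwise majority vote: each output bit takes the value
    held by at least two of the three input copies."""
    return (a & b) | (a & c) | (b & c)

word = 0b01001101
corrupted = word ^ (1 << 3)     # one copy suffers a single-bit flip

# The two intact copies outvote the corrupted one.
print(bin(majority(word, word, corrupted)))  # 0b1001101
```

The vote masks any single-copy fault; only two copies failing in the same bit at the same time gets through, which is why triplication drives the failure probability down so sharply.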
This is not a flaw in computing. This is computing. The discipline is, at its core, the engineering of reliability from unreliable substrates.