The relay race: how the computer was built
A century and a half of bottlenecks, batons, and the humans in between — from Babbage to Blackwell.
Sometime around 1821, in a cold upstairs room in London, a young mathematician named Charles Babbage sat across a table from his friend John Herschel. Between them lay sheets of astronomical calculations, worked out twice by two different human clerks so the errors could be found. The errors kept coming. Ships were running aground because of mistakes in the nautical tables. Fortunes were lost to mis-tallied actuarial figures. Artillery officers were killing their own men with faulty ballistic charts. Babbage, staring at yet another discrepancy, lost his patience and said the sentence that, in retrospect, begins the story of the modern world: “I wish to God these calculations had been executed by steam.” Herschel, unruffled, replied, “It is quite possible.”
Two centuries later, the descendants of that wish sit in your pocket. A phone the weight of a deck of cards contains more transistors than there were grains of sand in Babbage’s hourglass. Yet the line from his steam-driven dream to your glass rectangle is not a straight one, and it is not a smooth arc of progress. It is a relay race, six legs long, in which each runner solved a bottleneck that the previous runner created. Gears hit a wall, so electricity ran the next lap. Electricity hit heat, so silicon took the baton. Silicon hit speed, so parallelism sprinted next. What follows is that race, told as it happened — with its visionaries and eccentrics, its traitors and war heroes, its accidents and vacations and a single moth.
Act I — The age of mechanics
Before the 19th century, the word computer described a job, not a machine. Computers were people — usually women, often in teams — who worked through equations with pencil, paper, and fatigue. They filled nautical almanacs, insurance tables, astronomical ephemerides. They made mistakes. Tables were wrong because clerks were tired; because typesetters mis-set the lead; because printing presses smudged. One slip in a logarithm could put a ship on a reef. Europe’s great brains had been chipping at the problem for centuries. John Napier gave them logarithms and ivory “bones.” Blaise Pascal, at nineteen, built a brass box of interlocking wheels for his tax-collector father in Rouen. Leibniz designed a stepped drum that could multiply — though the one surviving specimen in Hanover never quite worked. The only sturdy error-check available to humanity was to have two clerks compute the same table and compare. It doubled the cost and caught only some of the mistakes. This was the world Babbage was staring at.
Babbage and the mill
Babbage was a Cambridge-bred polymath, rich, prickly, a co-founder of the Royal Astronomical Society, a designer of cowcatchers, a reformer of the British postal system, a man who waged a lifelong vendetta against organ-grinders for disturbing his concentration. His first machine, the Difference Engine, proposed to the Royal Astronomical Society on 14 June 1822, was a marvel of finite-difference arithmetic: 25,000 parts, eight feet tall, designed to print its own answers on stereotype plates so no human typesetter could corrupt them. The British government funded it with what eventually became £17,500 — about the price of twenty-two new Stephenson steam locomotives. Babbage hired the finest toolmaker in England, Joseph Clement, a gruff, self-made artisan with no polish and no patience. For a decade the two men pushed precision manufacturing to its 1820s limit. Then they fought. Clement maintained that the tools he had built (on Babbage’s money) were, by trade custom, his. In 1833 he downed tools. Construction never resumed. The 12,000 precision parts already made were eventually melted for scrap.
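Before leaving the Difference Engine behind, it is worth seeing what it actually mechanized. Here is a minimal sketch, in modern Python rather than brass, of the method of finite differences: tabulating a polynomial using nothing but repeated addition, one crank-turn per row. The polynomial, layout, and function names are illustrative, not Babbage's own.

```python
# A minimal sketch of the method of finite differences the Difference Engine
# mechanized: tabulating a polynomial using nothing but repeated addition.
# The polynomial and layout here are illustrative, not Babbage's tables.

def difference_table(poly, start, step, count):
    """Tabulate poly(x) at start, start+step, ... using only additions."""
    degree = len(poly) - 1
    def f(x):
        return sum(c * x**i for i, c in enumerate(poly))
    # Seed the engine: compute the first degree+1 values directly...
    seeds = [f(start + i * step) for i in range(degree + 1)]
    # ...then form the initial column of differences.
    diffs = [seeds[:]]
    for _ in range(degree):
        prev = diffs[-1]
        diffs.append([b - a for a, b in zip(prev, prev[1:])])
    column = [d[0] for d in diffs]   # value, 1st difference, 2nd difference, ...
    values = []
    for _ in range(count):
        values.append(column[0])
        # One "turn of the crank": each register absorbs the difference below it.
        for i in range(degree):
            column[i] += column[i + 1]
    return values

# x^2 + x + 41, a favourite demonstration polynomial of the period, at x = 0, 1, 2, ...
print(difference_table([41, 1, 1], start=0, step=1, count=6))
# -> [41, 43, 47, 53, 61, 71]
```

Once the seed values are set, the crank never multiplies; it only adds, which is exactly the kind of work gears can do.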
But out of that ruin came a stranger and grander idea. By 1837 Babbage had designed a machine he called the Analytical Engine — not a calculator but a computer. It had, in his words, a “mill” that did arithmetic (the CPU), a “store” that held a thousand fifty-digit numbers (the memory), and input-output through punched cards he had borrowed, openly, from the silk looms of Lyon. It could loop. It could branch on a condition. It could, in principle, do anything we today call computing. Babbage never built it. In 1991 the Science Museum built his Difference Engine No. 2 from his drawings, to tolerances achievable in his own day, and proved the designs could have worked. It was not physics that stopped him. It was people — politicians, engineers, funders, Parliament.
On the question of whether the machine could correct bad input, Babbage left a famous retort to the MPs who had asked him whether wrong numbers going in produced right answers coming out: “I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question.” Every programmer since has been staring into the same confusion.
The loom that thought
The punch cards Babbage stole were already old by the time he found them. In 1804 in Lyon, Joseph Marie Jacquard, a silk-weaver’s son turned revolutionary soldier turned inventor, had automated the most laborious part of silk weaving — the drawboy who sat atop the loom pulling threads according to a pattern. Jacquard chained stiff punched cards together so that holes and solid cardboard selected which warp threads lifted. Patterns that once took hours now took minutes. Napoleon made the loom public property in 1805 and thereafter wore its products as ceremonial robes.
In 1839, a Lyon firm named Didier, Petit et Cie commissioned the most astonishing piece of Jacquard weaving ever made: a portrait of Jacquard himself, rendered in black silk at 1,000 threads per inch, encoded across 24,000 punched cards, each card carrying over a thousand hole positions. Visitors thought it was an engraving. Babbage owned one, hung it in his London drawing room, and used it to explain his Engine to visitors — among them the Duke of Wellington and Prince Albert. Jacquard cards carried no computation. They carried instructions. That was enough. The cards were the first code.
The countess who saw further
Babbage demonstrated his fragment of Difference Engine at parties. At one of them, in June 1833, a seventeen-year-old girl named Augusta Ada Byron saw the brass wheels turn. She was the daughter of Lord Byron, the rock-star poet whose marriage had collapsed five weeks after her birth and who fled England forever a few months later. Her mother Annabella, whom Byron had nicknamed “the princess of parallelograms,” had raised Ada on a mathematical diet explicitly designed to exorcise any Byronic tendency toward madness. The education worked — and produced something neither parent anticipated. Ada, who called her own mind “poetical science,” refused to choose between her parents’ worlds.
A decade later, at Babbage’s urging, Ada (now the Countess of Lovelace) translated a paper on the Analytical Engine, written in French by the Italian engineer Luigi Menabrea, and appended her own Notes, three times longer than the paper itself. Note G contained a step-by-step procedure for computing Bernoulli numbers on a machine that did not yet exist. It is reasonably called the first computer program. But Lovelace’s real leap, buried in Note A, went further than Babbage had dared:
“The Analytical Engine might act upon other things besides number… the engine might compose elaborate and scientific pieces of music of any degree of complexity or extent.”
She had seen, a century before Shannon, that computation is not arithmetic. Computation is the manipulation of symbols, and numbers are only one species of symbol. She died of uterine cancer at thirty-six — the same age as her father — and asked to be buried beside him. Babbage called her his “enchantress of numbers.” For the next hundred years almost no one read her notes.
Hollerith and the census that didn’t fit
The mechanical baton was passed from the parlor to the bureau. The United States census of 1880 had taken seven and a half years to tabulate by hand; officials calculated that at the current growth rate the 1890 count would not finish before 1900 began. A young engineer named Herman Hollerith, dating his mentor’s daughter and moonlighting for the Census Bureau, had been chewing on the problem. While riding a train west, he watched a conductor perform what railroads called a “punch photograph” — holes punched in a passenger’s ticket marking hair color, eye color, height, to defeat ticket fraud. Why not punch people into cards?
Hollerith’s 1890 machine was a small revolution. Census clerks punched holes into cards sized exactly to fit existing Treasury Department cabinets (the size of a dollar bill). A reader lowered metal pins; where a hole lay, the pin dipped into a cup of mercury, completed an electric circuit, and advanced one of forty clock-face dials. A bell rang for each card. A twenty-four-drawer sorting box flipped open lids based on demographic criteria — an operator could, in a single run, find all the Norwegian-born people in Minnesota. A century before anyone said “binary,” Hollerith’s tabulator was using on/off electrical states to do digital work. The 1890 count came in six weeks; the full tabulation, a little over a year. The federal government saved an estimated $5 million. The Electrical Engineer wrote that the apparatus worked “unerringly as the mills of the gods, but beats them hollow as to speed.” Hollerith’s company, which began as the Tabulating Machine Company in 1896, merged into a conglomerate in 1911 and was renamed, in 1924, by Thomas J. Watson Sr.: International Business Machines.
The ceiling of the mechanical
By the 1930s the mechanical paradigm was walking into walls. Gears wear. Cams warp. Shafts bind. Every rotation of mass costs time you cannot buy back. A roomful of Hollerith machines could clatter at 150 cards per minute and no faster. Babbage’s Analytical Engine, had it been built, would have cycled a few times a second. The first runner had run her leg and was gasping for air. Somewhere on the edge of physics, the baton was already moving to electrons.
Act II — The age of electricity
The insight that would power the next leg was philosophical before it was engineering. Computation is about state change, not motion. A gear in position 7 and a switch in position “closed” represent the same information. But a switch can be flipped by electromagnetism in milliseconds, and an electronic valve in millionths of a second, while a gear has to move its own inertia. The 1930s and 40s were a staggered migration from mass to charge — first through electromechanical relays, clanking switches driven by electromagnets; then through vacuum tubes, glowing glass bulbs that could switch without any moving parts at all.
Three men in a living room, two at a bombe, a thousand in a war
In 1936, in his parents’ Berlin apartment on Wrangelstraße, a 26-year-old civil engineer named Konrad Zuse, sick to death of computing aircraft-structural equations by hand, quit his job to build a mechanical computer in the family living room. He cut 20,000 thin metal sheets with a jigsaw. His sister helped. The Z1 of 1938 was binary, floating-point, programmable with 35-mm movie film punched with holes — the first freely programmable binary computer on Earth. It jammed constantly. Zuse replaced its calculator with 600 relays salvaged from the Reichspost and called it Z2; in 1941 he built the Z3, with 2,600 relays, 5 Hz, a real working machine. He proposed a 2,000-tube vacuum successor to the Reich Air Ministry and was refused — the war, the officials said, would be won before it mattered. Allied bombs destroyed the Z1, Z2, and Z3 in 1943–45. Only the Z4, bundled onto a truck and driven to an Allgäu village, survived.
In Britain a Cambridge logician named Alan Turing had already written the mathematical poem that made all of this make sense. In summer 1935, lying in Grantchester Meadows, he conceived an imaginary device: an infinite tape, a read-write head, a finite list of rules. He showed that such a machine, if given the description of any other such machine, could simulate it — the Universal Turing Machine. The paper, “On Computable Numbers,” received by the London Mathematical Society in May 1936, accomplished three things at once. It defined what it meant for a number to be computable. It proved that David Hilbert’s 1928 “decision problem” — whether mathematical truth could be mechanized — had no solution. And without intending to, it described the blueprint for every computer that would ever be built. One machine. Many programs. Universality.
In 1937, at MIT, a 21-year-old electrical-engineering and math double-major named Claude Shannon realized something nobody had bothered to notice. For ninety years mathematicians had been using a dusty algebra invented by the self-taught Lincolnshire schoolmaster George Boole — an algebra with two values, 0 and 1, and three operators, AND, OR, and NOT. Shannon had studied Boolean logic as an undergraduate. At MIT he maintained Vannevar Bush’s Differential Analyzer, whose control circuit had about a hundred electromechanical relays. He summered at Bell Labs watching telephone switches route calls. One summer the fact hit him: a switch is a Boolean variable. Two switches in series are AND; two switches in parallel are OR; an inverter is NOT. Any Boolean expression — and therefore any digital arithmetic — could be built out of relays. His master’s thesis, A Symbolic Analysis of Relay and Switching Circuits, was filed in August 1937. Howard Gardner later called it “possibly the most important master’s thesis of the century.” Before Shannon, digital-circuit design was an art. After Shannon, it was engineering.
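A toy illustration of Shannon's insight, in Python rather than relays (the function names and the half-adder example are mine, not Shannon's): treat a closed contact as True and an open one as False, and series wiring becomes AND, parallel wiring becomes OR, a break contact becomes NOT.

```python
# A toy illustration of Shannon's observation, not his notation: if a closed
# switch is True and an open switch is False, series wiring is AND, parallel
# wiring is OR, and a normally-closed (break) contact is NOT. Any Boolean
# function, and hence any digital arithmetic, can be wired this way.

def series(a, b):    # current flows only if both contacts are closed
    return a and b

def parallel(a, b):  # current flows if either contact is closed
    return a or b

def invert(a):       # a break contact: closed when its input is off
    return not a

def half_adder(x, y):
    """One binary digit of addition, built purely from switch compositions."""
    total = series(parallel(x, y), invert(series(x, y)))  # XOR
    carry = series(x, y)                                  # AND
    return total, carry

for x in (False, True):
    for y in (False, True):
        print(int(x), "+", int(y), "->", tuple(map(int, half_adder(x, y))))
# 0 + 0 -> (0, 0)   0 + 1 -> (1, 0)   1 + 0 -> (1, 0)   1 + 1 -> (0, 1)
```

Chain enough of these compositions together and you have an adder of any width; that is the whole of digital arithmetic, reduced to wiring.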
Three strands — Turing’s mathematics, Shannon’s logic, and the urgent problem of Nazi ciphers — braided together at a Victorian country estate called Bletchley Park. Turing arrived there on 4 September 1939, the day after Britain declared war. He led Hut 8, which broke the German naval Enigma. His Bombe machines — wardrobe-sized electromechanical deduction engines — cracked U-boat signals by the hundreds a day. Historians estimate they shortened the war by two to four years and saved roughly fourteen million lives. Turing chained his tea mug to a radiator to stop theft, cycled in a gas mask to fight hay fever, and buried silver ingots in Shenley Woods that he later couldn’t find.
But the Enigma was not the hardest cipher at Bletchley. The Lorenz, used for Hitler’s communications with his generals, was harder by orders of magnitude. A working-class Post Office engineer named Tommy Flowers, the son of a Poplar bricklayer, had an unorthodox faith in vacuum tubes — believing that if you never turned them off, their filaments would last. His superiors refused to fund his design. Flowers built it anyway on the Post Office’s dime. Colossus, Mark 1, completed November 1943, had about 1,500 thermionic valves; Mark 2, with 2,400 valves, came online 1 June 1944 — five days before D-Day. It read paper tape at 5,000 characters per second, the first programmable electronic digital computer on Earth, though not yet a general-purpose one. A courier carried its decrypts to Eisenhower at SHAEF on 5 June; Colossus had confirmed that Hitler still believed the invasion would come at Calais. Eisenhower handed the paper back: “We go tomorrow.”
By war’s end, ten Colossi ran around the clock, operated by 273 women of the Wrens, sworn under the Official Secrets Act to tell anyone who asked that they did secretarial work. On Churchill’s orders the machines were smashed to pieces no larger than a hand and the blueprints burned. Colossus remained classified until 1975. Tommy Flowers, refused reimbursement for his own savings that had paid for the project, could never get a postwar bank loan because he could not explain what he had done in the war. This is why, for decades, a different machine got all the credit.
ENIAC and the six women who were called “models”
At the University of Pennsylvania’s Moore School, under Army contract, John Mauchly and J. Presper Eckert were building a monster. The Ballistic Research Laboratory at Aberdeen needed firing tables; a single trajectory took a human computer with a mechanical calculator twenty to forty hours. ENIAC — the Electronic Numerical Integrator and Computer — was unveiled to the press on 14 February 1946. It had 17,468 vacuum tubes, weighed thirty tons, sprawled across 1,800 square feet in a U-shape, consumed 150 kilowatts, and performed 5,000 additions per second — a thousand times faster than the relay-based Harvard Mark I that had entered service eighteen months earlier. For the press demonstration, the engineers fitted neon bulbs onto the tubes and covered them with painted ping-pong balls cut in half so the cameras could see the computation happening.
The New York Times headline: “Electronic Computer Figures Like a Flash.”
The ENIAC was programmed by rewiring — physically patching cables between units and flipping thousands of switches. The people who did this were six women who had been among Penn’s eighty female human computers: Kay McNulty, Jean Jennings, Betty Snyder, Marlyn Wescoff, Frances Bilas, and Ruth Lichterman. They were initially denied clearance to see the machine and learned to program it from blueprints alone. “Somebody gave us a whole stack of blueprints,” McNulty recalled, “and they said, ‘Here, figure out how the machine works and then figure out how to program it.’” Snyder and Jennings wrote the trajectory program that dazzled the reporters on 14 February. They were not invited to the celebratory dinner. The press photographs captioned only the men. Decades later, when a young researcher named Kathy Kleiman asked a museum who the women in the pictures were, she was told they were models.
The vacuum tube had carried the baton much farther and much faster than relays ever could. But tubes are glass balloons with hot wire inside them. They burn out like light bulbs — at ENIAC’s scale, roughly one every few days, sending technicians through the cabinets hunting for a dark one. They radiate heat; the Moore School lab needed dedicated air conditioning to stay below 120°F. They draw power; legend says Philadelphia’s lights dimmed when ENIAC turned on, a story the Penn archives politely treat as rumor. Above all, they do not scale. A machine with a hundred thousand tubes would fail somewhere before you could finish a calculation. The second runner had sprinted admirably. Her lungs were giving out. The baton had to move again.
The moth, and what came after
On 9 September 1947, at 3:45 in the afternoon, engineers working on the Harvard Mark II — including a young Navy lieutenant named Grace Hopper — found a moth wedged into Relay #70, Panel F. They taped it into the logbook with the caption “First actual case of bug being found.” The entry is a joke, not an origin story; engineers had been calling glitches “bugs” since at least Thomas Edison in 1878. The joke only works because the word was already in use. But the moth is in the Smithsonian, and the story stuck, because three months later an entirely different kind of bug — an electron — would begin to replace the glass and the wire.
Act III — The transistor revolution
At Bell Labs in Murray Hill, New Jersey, the director of research Mervin Kelly had seen the future and hated it. He had spent hundreds of thousands of dollars trying to build better vacuum tubes for the telephone network. They still glowed red. They still burned out. In 1945 he reorganized his physics group around an audacious goal: find a solid-state amplifier. No filament. No vacuum. Just a chunk of crystalline matter doing the work. Kelly designed the new Murray Hill hallways to be long, so that chemists would bump into metallurgists and physicists on their way to the restroom. He hired a brilliant, charismatic, and psychologically difficult theoretical physicist named William Shockley to lead the group. Shockley’s first design, based on applying an electric field to germanium, didn’t work. A quiet, mumbling theoretician named John Bardeen figured out why: electrons were pooling at the crystal’s surface and screening the field.
On 16 December 1947, in Building 1, Room 1E455, the experimentalist Walter Brattain wrapped gold foil around a plastic wedge, slit it at the vertex with a razor blade, and pressed the two gold contacts — fifty microns apart — onto a slab of germanium. Current flowing between the contacts modulated current flowing through the crystal. Brattain’s carpool heard on the drive home that he had just done the most important experiment of his life. Bardeen walked in the door and told his wife Jane, who was cooking dinner, “We discovered something today.” Jane, with the children underfoot, replied, “That’s nice, dear.” On 23 December, in a snowy formal demonstration to Bell Labs’ brass, the device amplified speech without warm-up. Shockley called it “a magnificent Christmas present.”
Shockley was furious that his name was not on the patent. Over the New Year’s weekend, alone in a Chicago hotel room, he worked out the theory of a more manufacturable design — the bipolar junction transistor — and kept it secret from Bardeen and Brattain for months. Bardeen left Bell Labs in 1951 for the University of Illinois, where he would win a second Nobel Prize in 1972 for the theory of superconductivity, the only person ever to win physics twice. Brattain moved to another group and refused to work with Shockley again. The three men shared the 1956 Nobel anyway, and the iconic AT&T photograph — Shockley seated at the microscope, Bardeen and Brattain standing behind him — staged a camaraderie that the principals later called a lie.
The press barely noticed the public announcement on 30 June 1948. The Times buried it on page 46. The name “transistor” — an abbreviated combination of transconductance/transfer and varistor, coined by the sci-fi-writing engineer John Pierce — entered the language.
A tiny faucet
Imagine a garden hose with a handle on it. The water from the main line is your big current, flowing between the emitter and the collector (or, in the later and now dominant field-effect variant, between the source and the drain). The handle is the base (or the gate). A microscopic twist of the handle — a tiny voltage, a trickle of current — controls an avalanche of water behind it. Turn the handle all the way closed: no current, a logical 0. Turn it all the way open: maximum current, a logical 1. A transistor is a faucet, and every computer you will ever use is billions of faucets opening and closing billions of times a second.
What makes this possible is a strange class of materials called semiconductors — crystals like silicon or germanium that are neither conductors nor insulators but can be persuaded, by “doping” them with minuscule traces of boron or phosphorus, to carry current via free electrons (n-type) or via the absence of electrons, called “holes” (p-type). Press n-type and p-type together and you get a junction through which current can be steered. Stack them right and you get a switch with no moving parts, barely any heat, no warm-up, no wear. It could last forever.
Silicon, and the world that followed
The first transistors went into hearing aids in 1952 — a courtesy to Alexander Graham Bell, whose wife had been deaf. In May 1954, at Texas Instruments, a Bell Labs veteran named Gordon Teal announced the first commercial silicon transistor; silicon tolerated heat that germanium could not. Five months later, TI partnered with a small Indianapolis company to release the Regency TR-1, the first pocket transistor radio — four germanium transistors, a 22.5-volt battery, twenty hours of life, $49.95, available in mandarin red, turquoise, and lime. Thomas Watson Jr. of IBM bought hundreds of them and handed them out to his engineers with orders to stop designing vacuum-tube machines. By 1957 IBM’s 608 was entirely transistorized. By 1958 the tube was finished in computing.
The tyranny of numbers
The trouble now was not the components but the wires between them. In June 1958, a Bell Labs executive named Jack Morton wrote that systems of thousands of discrete components suffered from what he called “the tyranny of numbers.” Every added transistor needed hand-soldered wires to every other relevant transistor. If each connection worked 99.9% of the time, a machine of a hundred thousand connections was essentially guaranteed to fail. ENIAC had five million solder joints. The transistor had solved one bottleneck and created another.
Two men, working in parallel and unaware of each other, saw the same solution in the same year. At Texas Instruments in Dallas, a lanky Kansan named Jack Kilby had joined the company in May 1958. TI had a mass-vacation policy; the whole plant shut for two weeks in July. Kilby, too new to have accrued any time off, found himself almost alone in an empty lab. In those quiet hallways he worked out that if transistors, resistors, and capacitors could all be carved from the same piece of semiconductor, the wires between them would simply vanish. On 12 September 1958, he connected a few components on a single slice of germanium with fine gold “flying wires,” flipped the switch, and watched a sine wave appear on the oscilloscope. The thing looked, in one reporter’s phrase, like a child’s failed art project. It was the first integrated circuit.
More than a thousand miles and eight months away, at Fairchild Semiconductor on the California peninsula, a charismatic preacher’s son from Iowa named Robert Noyce was thinking along the same track. Fairchild had been founded in September 1957 by the “Traitorous Eight,” a group of brilliant young scientists who had walked out on William Shockley ten months after his Nobel was announced. Shockley, running his Mountain View lab at 391 South San Antonio Road — since commemorated as the birthplace of Silicon Valley — had become unbearable: posting salaries publicly, demanding lie-detector tests after a secretary cut her finger on a thumbtack, steering the company toward a pet project he kept secret from his own staff. Eight of his best — Noyce, Gordon Moore, Jean Hoerni, Jay Last, Julius Blank, Victor Grinich, Eugene Kleiner, Sheldon Roberts — flew to San Francisco, met a young Harvard MBA named Arthur Rock in the Redwood Room at the Clift Hotel, and were told they should start their own company. Sherman Fairchild put up $1.38 million. Shockley called it a betrayal.
At Fairchild, Jean Hoerni invented the planar process — covering the silicon surface with a thin layer of silicon dioxide that acted as both insulation and mask. On 23 January 1959, Noyce sketched in his patent notebook how to use Hoerni’s oxide to build an entire circuit on one chip, with printed aluminum lines replacing Kilby’s flying wires. Silicon instead of germanium; printed interconnects instead of hand-soldered wires; mass-producible. Noyce’s design is the one every chip in the world still uses today.
A decade of patent interference ended in cross-licensing, and Kilby and Noyce came to be credited as co-inventors. When Kilby won the Nobel in 2000, he used his lecture to say that Noyce, who had died of a heart attack in 1990, should have shared it. Noyce, in life, had shrugged the prospect off: “They don’t give Nobel Prizes for engineering or real work.”
Act IV — Integration and the law
On 19 April 1965, Electronics magazine published, for its 35th-anniversary issue, a short essay by Fairchild’s director of R&D, Gordon Moore. He had plotted the number of components on the best chip you could buy for the lowest cost per transistor, year by year. Five data points. A straight line on semi-log paper. Moore predicted the line would continue — that the number of components on a chip would roughly double every year (he would revise it in 1975 to every two years). He titled it, deadpan, “Cramming more components onto integrated circuits,” and for good measure he forecast that integrated circuits would lead to “home computers — or at least terminals connected to a central computer — automatic controls for automobiles, and personal portable communications equipment.”
Moore later admitted: “I just did a wild extrapolation.” His Caltech colleague Carver Mead named it Moore’s Law. What started as a trendline became a commandment. The semiconductor industry built its entire planning apparatus — the International Technology Roadmap for Semiconductors — around hitting Moore’s cadence node by node, generation by generation. Every fab, every lithography tool maker, every dopant chemist in the world synchronized their watches to it. Moore’s law, alone among all laws of physics or economics, became a self-fulfilling prophecy: the world decided it was true and then made it true, for sixty years and counting.
How you print a world
The trick of hitting Moore’s line is a process that, in a different language, might be called photographic sorcery. A disc of single-crystal silicon, grown by dipping a seed into a vat of molten silicon and slowly pulling, is polished to mirror smoothness. Oxide is grown on top of it by heating. A light-sensitive polymer — photoresist — is spun onto that. A high-precision quartz plate called a mask, carrying the pattern of one layer of the chip, is held above the wafer; ultraviolet light shines through it. Where light hits the resist, the chemistry changes. Solvents wash the soluble parts away. The exposed oxide is etched off. Dopants are diffused into the bared silicon. Metal is evaporated onto the next pattern. And so on, sixty or a hundred layers deep, as the feature size shrinks from ten micrometers in 1971 to three nanometers today. This is photolithography — printing circuits the way Gutenberg printed books, but with light and atoms.
At the bleeding edge, visible light is too coarse. A single Dutch company, ASML, builds machines the size of a delivery van that generate ultraviolet at 13.5 nanometers by firing a CO2 laser fifty thousand times a second at falling droplets of molten tin, converting them into glowing plasma whose light bounces off the most precise mirrors ever made. MIT Technology Review calls ASML’s machine “the machine that saved Moore’s Law.” Without it, the phone in your pocket does not exist.
Silicon Valley by name
The place where all of this was happening got its name on 11 January 1971, when a journalist named Don Hoefler wrote a three-part front-page series in Electronic News titled “Silicon Valley USA.” Until then the area had been called the Valley of Heart’s Delight, because until the semiconductor industry arrived it was the world’s largest fruit-packing region, with thirty-nine canneries. Hoefler’s name stuck because it was just accurate: the revolutionary chips were being made in a valley, out of silicon. Hewlett-Packard had started the local habit in January 1939 in a Palo Alto garage. Stanford’s provost Fred Terman had been pushing his best engineering students to stay west for decades. Fairchild’s alumni — the “Fairchildren” — fanned out through the late 1960s founding more than seventy companies, among them AMD, National Semiconductor, Signetics, and Intel.
A chip called 4004
Noyce and Moore, chafing under Fairchild’s East Coast management, incorporated a new company on 18 July 1968 in Mountain View. They briefly considered calling it Moore Noyce before deciding that sounded too much like “more noise,” which in electronics is a bad thing. They settled on Intel — short for integrated electronics. Andy Grove, a Hungarian refugee who had escaped the 1956 Budapest uprising by crossing the Austrian border on foot, joined as employee number three. For its first decade Intel sold memory chips.
In April 1969, a Japanese calculator manufacturer named Busicom asked Intel to design a twelve-chip set for a printing calculator. The Intel engineers Ted Hoff and Stan Mazor counter-proposed a radically simpler idea: put a general-purpose programmable CPU on one chip, and let software do the rest. An Italian physicist named Federico Faggin, newly arrived from Fairchild, took over the silicon design. Busicom’s Masatoshi Shima wrote the firmware. On 15 November 1971, Intel announced the 4004 in a two-page spread in Electronic News: “Announcing a new era of integrated electronics.” The chip held 2,300 transistors on a die three millimeters wide, ran at 740 kilohertz, did 92,000 operations per second, and sold, at first, for about sixty dollars. Faggin etched his initials, F.F., into the silicon. “I felt,” he said, “it was a true work of art.”
Ted Hoff’s summary is the one to keep: “We democratized the computer.” Before the 4004, a computer was a thing in a room. After it, a computer could be a thing in a calculator, a traffic light, a pinball machine, a blood analyzer, a cash register, a cow collar. A microprocessor could go, eventually, anywhere.
The 4004 led directly to the Intel 8080, to the Altair 8800 hobbyist kit whose cover on Popular Electronics in January 1975 caught the eye of Paul Allen and Bill Gates, and to the Homebrew Computer Club, whose 5 March 1975 meeting in a Menlo Park garage drew a 24-year-old Hewlett-Packard engineer named Steve Wozniak. “After my first meeting,” Wozniak wrote, “I started designing the computer that would later be known as the Apple I. It was that inspiring.” The line ran from there to the Apple II, the IBM PC, and the boxes that colonized every desk on Earth over the next fifteen years. The transistor count kept doubling. The 8086 of 1978 held 29,000 transistors; the 80386 of 1985 held 275,000; the original Pentium of 1993 held 3.1 million. Moore’s line held.
Act V — The architecture wars
As the number of transistors exploded, a question that had been easy to ignore began to matter very much. What were all those transistors supposed to do? What, exactly, is the logical anatomy of a computer — and how do you make one go fast?
The stored program, and who deserves the credit
The answer had been written down in pencil, on trains between Princeton and Los Alamos, in 1945. A Hungarian mathematician named John von Neumann — a polymath who worked on the atomic bomb by day — had struck up a conversation on an Aberdeen train platform in 1944 with an Army liaison officer named Herman Goldstine, who had mentioned a machine called ENIAC. Von Neumann visited, studied the project, and in the spring of 1945 wrote a 101-page document titled “First Draft of a Report on the EDVAC.” In it he described, in stripped-down abstract form, a machine in which instructions and data lived in the same memory — a single stored program that the machine fetched, decoded, and executed, one instruction at a time.
The core of it runs something like this: a program counter holds the address of the next instruction. The control unit fetches that instruction into the instruction register. It decodes the bit pattern into a command. The arithmetic logic unit executes the command on data pulled from registers — the CPU’s own tiny internal storage, measured in bytes, accessible in fractions of a nanosecond. The result is written back. The program counter advances. Repeat a billion times a second.
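A minimal sketch of that loop, assuming an invented three-instruction machine (the opcodes, memory layout, and sample program are made up for illustration; real instruction sets are vastly richer):

```python
# A minimal sketch of the von Neumann fetch-decode-execute loop. The
# three-instruction machine and its sample program are invented for
# illustration, not drawn from any real architecture.

LOAD, ADD, HALT = 0, 1, 2   # made-up opcodes

def run(memory):
    pc = 0          # program counter: address of the next instruction
    acc = 0         # a single register, the "accumulator"
    while True:
        op, arg = memory[pc]        # fetch the instruction at pc
        pc += 1                     # advance the program counter
        if op == LOAD:              # decode, then execute
            acc = memory[arg]       # load a data word into the register
        elif op == ADD:
            acc += memory[arg]      # add a data word to the register
        elif op == HALT:
            return acc

# Instructions and data share one memory: the defining von Neumann trait.
program = {
    0: (LOAD, 10),   # acc <- mem[10]
    1: (ADD, 11),    # acc <- acc + mem[11]
    2: (HALT, 0),
    10: 2,           # data
    11: 3,
}
print(run(program))   # -> 5
```

Everything a modern core does, from speculation to out-of-order execution, is an elaboration of this one loop made to appear as if it still ran this simply.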
Goldstine mimeographed twenty-four copies of the First Draft and circulated them that summer. J. Presper Eckert and John Mauchly, who had actually built ENIAC and had already sketched the same ideas from the engineering side, were furious. The document amounted to a public disclosure that voided their patent rights. Eckert and Mauchly quit the Moore School. They spent the rest of their lives — and the 1973 Honeywell v. Sperry Rand lawsuit — fighting for credit that the world had already given to the man who wrote the paper. Today we call the design von Neumann architecture anyway. Virtually every general-purpose computer on Earth is one.
The pyramid of memory
The truth that shaped everything that followed is that memory is slow. Not in absolute terms — DRAM is astonishing in absolute terms — but compared to how fast a modern CPU can compute. A register responds in about 0.3 nanoseconds; main memory takes a hundred. If you scaled that to human time, touching a register would take a second, and reaching into RAM would take six minutes. Reaching to a spinning hard drive would take six months. Reaching across the internet would take a century.
Architects responded with a hierarchy. Right next to the ALU, a few dozen registers. Around them, a tiny, fast L1 cache of a few dozen kilobytes. Around that, a larger, slower L2. Around that, a still larger L3, shared among cores. Around all of that, main memory, and beyond memory the solid-state drives, and beyond them the hard disks, and beyond them the internet. Each tier gets bigger and slower by an order of magnitude. The reason this works is the principle of locality: programs tend to touch the same data again (temporal locality) and to touch data near what they just touched (spatial locality). Every speed-up trick in modern architecture is some variation of that insight.
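The effect of locality is easy to feel for yourself. A small, machine-dependent experiment (the array size is arbitrary and the exact timings will vary): walking a large matrix along its rows rides the cache, walking it down its columns fights it.

```python
# A small experiment in spatial locality. Timings are machine-dependent and
# the array size is arbitrary. Both walks touch the same ~128 MB of floats;
# the row-major walk strides through memory in order and keeps hitting data
# the cache has already pulled in, while the column-major walk keeps missing.

import time
import numpy as np

a = np.ones((4096, 4096))           # ~128 MB, far larger than any cache

def walk_rows(m):
    total = 0.0
    for i in range(m.shape[0]):
        total += m[i, :].sum()       # contiguous in memory (row-major)
    return total

def walk_cols(m):
    total = 0.0
    for j in range(m.shape[1]):
        total += m[:, j].sum()       # strided: one useful element per cache line
    return total

for walk in (walk_rows, walk_cols):
    t0 = time.perf_counter()
    walk(a)
    print(walk.__name__, f"{time.perf_counter() - t0:.3f}s")
```

Same data, same arithmetic, different order of visits; the gap between the two numbers is the memory hierarchy making itself visible.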
Around the CPU, a bus system — a data bus, an address bus, a control bus — stitches the motherboard together. The buses of consumer PCs evolved from PCI at 133 MB/s in 1992 through PCI Express generations to PCIe 5.0 at 4 GB/s per lane. In AI accelerators NVIDIA’s NVLink 5 runs at 1.8 terabytes per second between GPUs — an order of magnitude beyond PCIe. The motherboard, in metaphor, is the central nervous system, and the buses are its nerves.
RISC against CISC
By the late 1970s the dominant instruction sets — especially Intel’s x86 and DEC’s VAX — had grown baroque. Instructions came in variable lengths, did complicated things, used dozens of addressing modes. A young Berkeley professor named David Patterson and a young Stanford professor named John Hennessy asked, independently, whether this was necessary. Their answer was RISC — Reduced Instruction Set Computing: a small, regular set of simple instructions, each executing in one cycle, with the complexity shifted from hardware to compilers. Berkeley’s RISC-II, built by graduate students, ran three times faster than its own predecessor. Patterson’s quote on what was happening: “There is this remarkable point in time when it was clear that a handful of grad students at Berkeley or Stanford could build a microprocessor that was arguably better than what industry could build.” Patterson and Hennessy won the 2017 Turing Award. By then, 99% of the 16 billion microprocessors produced every year were RISC.
x86 survived in PCs because IBM had chosen it for the 1981 PC and software inertia was enormous. But the philosophical victory went to RISC: modern x86 chips have, since 1995, secretly translated their baroque instructions into internal RISC-like micro-ops. And in the world that came to matter more than PCs — mobile phones — a British RISC design called ARM swept everything else aside, because its simple decoder used less silicon and less power, and every joule counts when you run on a battery.
The gigahertz wall
Through the 1990s a beautiful thing called Dennard scaling gave the industry free speed. Every node shrink, every couple of years, cut transistor size by 30%, cut voltage by 30%, and let the clock rate rise 40%. Power density stayed constant. Speed went up for free. Consumer clocks climbed from Apple II’s 1 MHz in 1977 to Pentium’s 60 MHz in 1993 to the magical 1 GHz, which AMD hit first with the Athlon in March 2000, one step ahead of Intel. Intel’s NetBurst architecture — first in the Pentium 4 — was designed to scale all the way to 10 GHz.
It never got there. Somewhere between 2004 and 2006, Dennard scaling collapsed. Voltage could not keep dropping because leakage current began to dominate — electrons tunneling through ever-thinner oxide, static power growing exponentially as threshold voltage fell. Power scales as capacitance times voltage squared times frequency; when voltage flattens and frequency rises, heat rises with it. The 2004 Pentium 4 “Prescott” ran so hot that reviewers roasted it for the 115 watts it dissipated. Intel publicly canceled its successor, Tejas, on 7 May 2004 — early silicon at 2.8 GHz was already pulling 150 watts. The line that had carried semiconductors for thirty years ran straight into a thermal wall.
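A back-of-the-envelope sketch of that arithmetic, with normalized rather than real numbers: under Dennard's rules a node shrink leaves power density flat; hold the voltage still and the same shrink roughly doubles it.

```python
# The dynamic-power relation the text quotes, P ~ C * V^2 * f, with
# normalized, illustrative numbers (not any real chip's). Dennard scaling:
# each node shrink scales dimensions and voltage by ~0.7 and clock by ~1.4,
# and the heat per unit area stays flat. Hold voltage still and it does not.

def power_density(cap, volts, freq, area):
    """Dynamic power per unit area: (C * V^2 * f) / A."""
    return cap * volts**2 * freq / area

base = power_density(cap=1.0, volts=1.0, freq=1.0, area=1.0)       # normalized

# One Dennard generation: C and V fall 30%, f rises 40%, area falls to 0.49.
dennard = power_density(cap=0.7, volts=0.7, freq=1.4, area=0.49)

# Post-2005: same shrink and clock bump, but leakage keeps V pinned at 1.0.
post = power_density(cap=0.7, volts=1.0, freq=1.4, area=0.49)

print(f"Dennard generation: {dennard / base:.2f}x power density")   # ~0.98x
print(f"voltage stuck:      {post / base:.2f}x power density")      # ~2.00x
```

Double the heat per square millimetre every two years and you reach the surface power density of a hot plate, then of worse things; that is the wall Prescott hit.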
Intel panicked. Salvation came from an unexpected place. The company’s Haifa, Israel R&D center had been quietly working on a lower-clocked, lower-power chip code-named Banias for laptops. Intel’s corporate dogma — more gigahertz — had nearly killed it. The Israeli team lead, Mooly Eden, later described how they saved it: “We did it the Israeli way; we argued our case to death. You know what an exchange of opinions is in Israel? You come to the meeting with your opinion, and you leave with mine.” Banias became the Pentium M, the heart of the Centrino platform; its descendants became the Intel Core family. On 27 July 2006, Intel launched the Core 2 Duo, with two cores on one die at a slower clock. Paul Otellini, the new CEO, called it “not just incremental change; it’s a generational leap.” AMD had got to a native dual-core design first, shipping the Athlon 64 X2 in 2005. Either way, the direction had changed.
The answer to the gigahertz wall was not faster cores but more cores. Parallelism. The fourth runner had handed off. The baton was now sprinting sideways.
Act VI — Parallelism and the GPU era
The trouble with parallelism is that most problems are not easy to parallelize. In 1967 an IBM engineer named Gene Amdahl proved the painful ceiling: if 10% of a program is inherently serial, no number of cores can speed it up more than tenfold. Race conditions, deadlocks, cache coherency, and the programmer’s own confused brain stood in the way of more cores equalling more speed. Through the late 2000s, single-thread performance crept up by 5% or 10% a year while transistor counts kept doubling. The extra transistors had to go somewhere. They went into caches, into more cores, into vector units, and — most consequentially — into a strange and increasingly important kind of chip that had, until 2007, been mostly used to render video game graphics.
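Amdahl's ceiling fits in a few lines (the 10% serial fraction below is the example from the paragraph above, not a measured program):

```python
# Amdahl's ceiling: if a fraction s of the work is inherently serial,
# the speedup on n cores is 1 / (s + (1 - s) / n), and it can never
# exceed 1 / s no matter how many cores you add.

def amdahl_speedup(serial_fraction, cores):
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cores)

for cores in (2, 8, 64, 1024, 1_000_000):
    print(f"{cores:>9} cores -> {amdahl_speedup(0.10, cores):5.2f}x")
# Even a million cores leave a 10%-serial program just shy of 10x faster.
```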
The accidental supercomputer
NVIDIA was founded on 5 April 1993 at a Denny’s on Berryessa Road in east San Jose, by three engineers with forty thousand dollars and no plan. “I had no idea how to do it,” Jensen Huang recalled, “nor did they. None of us knew how to do anything.” They named the company after the Latin word for envy. For six years NVIDIA made graphics cards for PC games. In October 1999 they released the GeForce 256, which they marketed, with the audacity of a company learning how to narrate itself, as “the world’s first GPU.” The acronym stuck.
Graphics rendering is an unusual problem. To paint a frame you must execute the same small shader program on every one of two million pixels, independently. It is what computer scientists call embarrassingly parallel — not a little parallel, not somewhat parallel, embarrassingly parallel. A GPU is built, therefore, as a chip of thousands of very simple cores doing the same thing in lockstep on different data. It trades the sophistication of a CPU for the throughput of a factory.
Sometime in the early 2000s researchers began to notice that graphics cards, once their shaders became programmable, could do more than shade. They could do physics. They could do fluid simulation. They could do, in principle, anything that could be expressed as the same operation on many pieces of data at once. A Stanford graduate student named Ian Buck built a stream-processing language for GPUs called Brook. Jensen Huang hired him.
Huang made the kind of bet that makes or breaks a company. He committed NVIDIA to building CUDA — a general-purpose programming layer that let scientists write ordinary C code and run it on a GPU. CUDA was unveiled in November 2006 alongside the GeForce 8800 GTX, and version 1.0 shipped the following year. For the next seven years almost no one cared. From 2009 to 2015 NVIDIA’s market capitalization bounced around $10 billion. Huang kept pouring money in — roughly $12 billion of R&D over a decade — while Wall Street asked what the point was. Huang later called it “the first strategic decision that came closest to an existential threat.”
The bedroom that changed everything
The answer came on 30 September 2012. A PhD student named Alex Krizhevsky, working in his bedroom at his parents’ house with his advisor Geoffrey Hinton and his colleague Ilya Sutskever at the University of Toronto, submitted a convolutional neural network to the annual ImageNet image-recognition contest. The network was trained on 1.2 million images over five to six days on two NVIDIA GTX 580 gaming cards, each costing $500. Krizhevsky’s team — calling itself SuperVision — scored a top-5 error rate of 15.3%. The second-place team scored 26.2%. In a field that usually measured progress in tenths of a percentage point, SuperVision had won by eleven points. Yann LeCun called it “an unequivocal turning point in the history of computer vision.” Hinton later quipped: “Ilya thought we should do it, Alex made it work, and I got the Nobel Prize.”
The reason AlexNet worked, and the reason every deep neural network since has worked, is that the core mathematical operation of neural nets is matrix multiplication — vast grids of independent multiply-and-add operations. That is exactly what a GPU is shaped to do. Fei-Fei Li, who had built ImageNet in the first place, put it simply: three things converged that September — a massive labeled dataset, a deep convolutional algorithm, and GPU compute.
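A tiny sketch of why the fit is so good (the sizes are illustrative): every element of a matrix product is its own independent dot product, so thousands of simple cores can each take one with no coordination.

```python
# Why neural nets and GPUs fit: every element of a matrix product C = A @ B
# is an independent dot product, so thousands of simple cores can each
# compute one in parallel. Sizes here are tiny and purely illustrative.

import numpy as np

def matmul_elementwise(A, B):
    m, k = A.shape
    k2, n = B.shape
    assert k == k2
    C = np.zeros((m, n))
    for i in range(m):          # on a GPU, every (i, j) pair would be
        for j in range(n):      # handed to its own thread, all at once
            C[i, j] = sum(A[i, p] * B[p, j] for p in range(k))
    return C

A = np.random.rand(4, 3)
B = np.random.rand(3, 5)
assert np.allclose(matmul_elementwise(A, B), A @ B)
print("a 4x5 output is 20 independent multiply-and-add jobs")
```

Scale the toy loop up to the billions of elements in a real network's weight matrices and the case for a chip with thousands of cores makes itself.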
The decade of deep learning began that afternoon.
For NVIDIA, the consequences were biblical. Volta (2017) introduced dedicated Tensor Cores that did 4×4 matrix multiply-accumulates in a single instruction. Hopper (H100, 2022) packed 80 billion transistors, ran at 4 petaflops of FP8, and sold for $25,000 to $30,000 to anyone who could get one — Larry Ellison told a 2023 dinner audience that he and Elon Musk had spent “an hour of sushi and begging” with Huang to secure an allocation. Blackwell (2024) crossed 208 billion transistors across two fused reticle-limit dies, 20 petaflops of FP4, and 1,000 watts per package. NVIDIA passed a $1 trillion market capitalization in May 2023, $3 trillion in June 2024, and in late October 2025 briefly touched $5 trillion, becoming for a while the most valuable company on Earth. Over 90% of AI training in 2026 runs on NVIDIA hardware. Huang’s unofficial corporate motto, repeated internally for two decades, is “Our company is thirty days from going out of business.”
The zoo of modern silicon
Today’s computer is no longer a computer. It is a small city of cooperating silicon specialists. A modern laptop contains several different kinds of chip, each specialized for a different shape of work.
CPUs still do what CPUs have always done — a handful of powerful, complex, out-of-order cores that chew through serial control flow and branching logic. GPUs do the embarrassingly parallel work: graphics, physics, neural nets. NPUs and TPUs — like Google’s Tensor Processing Unit, built in fifteen months in 2014–15 under the engineer Norm Jouppi, a 65,536-element matrix multiplier on a single chip that Google claimed was 15 to 30 times faster and 30 to 80 times more power-efficient than contemporary CPUs and GPUs — specialize further still, doing only the specific math that runs neural networks. Apple’s Neural Engine, Qualcomm’s Hexagon, Amazon’s Trainium, Microsoft’s Maia all play the same game.
The most interesting chip of the last five years may be Apple’s M1, unveiled 10 November 2020. It held 16 billion transistors on a 5-nanometer TSMC process. Its innovation was not any single component but the arrangement: CPU cores, GPU cores, Neural Engine, and memory all sat on the same package, sharing a single pool of unified memory with no copying between them — Apple’s Unified Memory Architecture. The M1 drew a fraction of the power of the Intel chips it replaced and ran far cooler. Within months it was outselling Intel Macs. By 2025 the M3 Ultra held 184 billion transistors in a consumer desktop chip. Gordon Moore’s 1965 line held.
The road ahead
The road runs out. At 2 nanometers and below, transistor features are only a few dozen atoms wide; single atoms of dopant matter. Physics is closing in. Several bets are currently running to keep the race going.
Three-dimensional stacking puts layers of silicon on top of each other instead of beside each other. AMD’s 3D V-Cache, introduced in 2022, bonds a 64-megabyte cache directly atop a CPU die using TSMC’s SoIC hybrid copper bonding — no microbumps, just copper-to-copper — with over 200 times the interconnect density of 2D packaging. TSMC’s CoWoS packaging, which places multiple dies on a shared silicon interposer, is what makes modern NVIDIA AI chips possible, and its limited worldwide capacity is why H100s and B200s have been on allocation for three years.
Neuromorphic chips imitate the brain’s architecture directly. IBM’s TrueNorth (2014) built a million “neurons” and 256 million “synapses” onto 5.4 billion transistors drawing only 70 milliwatts. Intel’s Loihi pursues spiking neural networks that fire only when signal crosses a threshold, reporting energy savings of one to two orders of magnitude on certain tasks. Neuromorphic chips remain a research field, not a business — but they point at a design philosophy in which computation follows biology rather than Boolean algebra.
Quantum computing abandons bits altogether. A qubit can be 0 and 1 simultaneously — a mathematical state called superposition — and qubits can entangle so that measuring one instantly determines the other. For a narrow class of problems — factoring large numbers, simulating molecules — quantum machines could in principle outrun any classical computer exponentially; for certain kinds of search, the gain is quadratic rather than exponential. On 23 October 2019, Google’s Sycamore, a 54-qubit superconducting processor, completed in 200 seconds a specific random-sampling task that Google claimed would take a classical supercomputer 10,000 years. IBM promptly rebutted with 2.5 days. The term “quantum supremacy” remains disputed. Today’s qubits decohere in microseconds; error-correction schemes require roughly a thousand physical qubits to make one logical one. IBM’s Condor processor crossed 1,121 qubits in 2023. The useful quantum computer remains, as it has for thirty years, ten or twenty years away.
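For the mathematically curious, a minimal state-vector sketch of those two claims, using the textbook formalism rather than any real quantum hardware; the gates are standard, the code is illustrative only.

```python
# A minimal state-vector sketch (standard quantum formalism, no real hardware)
# of the two claims above: a qubit held in superposition, and two qubits
# entangled so that measuring one fixes the other.

import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)       # Hadamard gate
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]])                     # flip target if control is 1

zero = np.array([1, 0])                             # the state |0>
plus = H @ zero                                     # superposition of |0> and |1>
print("P(0), P(1) for one qubit:", np.round(plus**2, 2))      # [0.5 0.5]

# Entangle two qubits into a Bell state: (|00> + |11>) / sqrt(2)
bell = CNOT @ np.kron(plus, zero)
probs = np.round(bell**2, 2)                        # over |00>, |01>, |10>, |11>
print("P(00), P(01), P(10), P(11):", probs)         # [0.5 0.  0.  0.5]
# Outcomes 01 and 10 never occur: measure one qubit and the other follows.
```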
And photonic computing, which runs calculations at the speed of light through silicon waveguides, and in-memory computing, which does arithmetic where the data lives, and carbon nanotube transistors, which replace silicon with molecular wires, all sit in laboratories waiting for their moment.
Epilogue: the relay race seen whole
Stand back and the century and a half from Babbage’s outburst to Huang’s Blackwell looks like a single motion, even though nobody inside it ever saw the whole. The gears of Jacquard and Babbage hit a wall of friction, mass, and precision, and the relay passed to electricity. Relays hit a wall of speed, and the vacuum tube took off. Tubes hit a wall of heat and failure, and Bardeen, Brattain, and Shockley handed the baton to the transistor. The transistor hit the tyranny of its own wires, and Kilby and Noyce stitched them together on silicon. Silicon hit the limits of hand-drawn circuits, and photolithography taught us to print them. Printed transistors hit the gigahertz wall at Prescott, and parallelism picked the race up at Core 2 Duo. Multicore parallelism hit Amdahl’s ceiling, and the GPU — a chip designed to paint video games — turned out to be the machine that could teach computers to see. Every leap made the next one necessary. Every solution created the next bottleneck.
What is striking is how human the story is. A teenage Pascal for his father. A mother trying to exorcise Byron from her daughter. A conductor punching holes in a train ticket. A Kansan alone in an empty lab on his coworkers’ vacation. Eight scientists signing a dollar bill. Six women working from blueprints because no one would give them clearance. A Hungarian refugee running operations. An Italian etching his initials into silicon. Three men in a Denny’s, betting on envy. A graduate student in his parents’ bedroom, training a neural net on a gaming card. Each of them solving the problem the last generation left behind.
The computer is alien when you meet it finished, in your pocket, humming at three gigahertz behind a sheet of glass. It is not alien at all when you meet it unfinished, in a Lyon silk loom or a Bletchley hut or a Dallas lab in July. It is what we have always been building: a thing to do the sums for us, so we could think about something else.
The baton is still in the air.