The Greatest Computer Ever Made
Let’s just take a moment to visualize something. In 1969 — yes that’s right, nearly 50 years ago — people walked on the moon! It seems to be an obvious piece of history today, but that doesn’t take away from the wonder.
That’s the magic of the Apollo program. In what was perhaps the most awesome series of events in generations, NASA managed to get 12 people to put their feet on the moon, and it was by no means a small feat. The Apollo missions required groundbreaking engineering innovations, and above all some sheer guts. There are so many different aspects to handle when it comes to planning a mission like this. After all, they don’t use the term “rocket science” without reason!
One particular aspect of Apollo that I’ve always wondered about is the part where they actually get to the moon. The most crucial component in this process is the flight computer, which gives the astronauts a real-time overview of the spacecraft, and has programmed routines to make their lives easier. This machine, as you’ll soon see, was a milestone for instrumentation, control and software engineering.
It was completely designed, built and programmed at MIT’s Instrumentation Lab. That’s right, not a corporation. Even giants like IBM wanted to take up the project, but NASA placed their faith in this university. Well, MIT isn’t just some ordinary university though, and their innovations would change the world.
The Computer
Let’s get to the star of our show, the messiah, the Apollo Guidance Computer. Why do I evangelize, you ask? To understand how heavenly the software was, let’s first examine how hellish the hardware that they had to work with was.
Here’s the computer itself! Unimpressed? At the time, a computer was the size of a room, and sometimes an entire building. This one had to fit in a tiny package on board the command module, and its size imposed severe restrictions on the hardware.
How much RAM does your computer have? Eight gigabytes? Four perhaps? The AGC had kilobytes. Just 2,048 words of erasable memory, each 16 bits wide, or about four kilobytes. With only this tiny space to work with, programs needed to be incredibly terse and efficient.
Apart from the 2K words of erasable RAM, the programs themselves had to be stored in only 36K words (roughly 72 kilobytes) of read-only memory, or ROM. Also, since these components would be enduring harsh accelerations and forces on the rocket, they needed to develop memory that wouldn’t fail at any cost. The solution was Core Rope Memory (CRM), shown in the image. Ones and zeroes were literally woven into a board on a loom by textile workers! Gives a whole new meaning to storing strings, eh?
The processor ran on *integrated circuits*, and this was one of the first times that ICs were used in a computer. In fact, the technology was so new, the production engineers didn’t even know how to test them! Every single step of this computer’s hardware was a gamble, a story of clever trial and error by some very smart people.
The Problem(s)
When you attempt a project this ambitious, there are bound to be problems. A lot of them. The AGC was built at a time when the concept of software itself was a little bit fuzzy. Software was simply part of the process of developing the hardware, and the MIT researchers were expected to program the thing as well. On this project however, the software was by no means trivial. It introduced engineering problems in a field that wasn’t even considered real engineering.
One problem of particular interest is the issue of handling multi-programming. Clearly, the AGC had a lot of different tasks to do at the same time, like calculating trajectories, running maintenance programs, displaying information to the user, etc. How would it handle all of these at once, especially with the pathetically low amount of memory it had?
Traditional computers at the time used a system called round-robin scheduling. When there were multiple tasks to be done, the processor handled each one for a fixed slot of time, and kept cycling between them. For 3 processes P1, P2, and P3, the computer might handle P1 for 10 microseconds, P2 for 10 microseconds, P3 for 10 more, and then go back to P1.
This method has its problems. For one, if a short program comes along, then you’re wasting the remainder of its slot. If there’s a long or important program that’s demanding a lot of computing power, you aren’t giving it enough attention. Round-robin can work for large server computers that might be working with repetitive tasks, but for a real-time system that must be responsive for the user, it fails.
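To make the waste concrete, here is a minimal Python sketch of the strict fixed-slot round-robin described above. It is only an illustration: the task names, the work amounts and the `round_robin` function are made up for this example, and it assumes a slot is always charged in full even when the task finishes early.

```python
from collections import deque

def round_robin(tasks, time_slice):
    """Simulate strict fixed-slot round-robin scheduling.

    tasks: dict mapping a task name to the work it needs (arbitrary time units).
    Returns (finish_times, idle_time), where idle_time is the time wasted
    because a task finished before its fixed slot was over.
    """
    queue = deque(tasks.items())
    clock = 0
    idle_time = 0
    finish_times = {}
    while queue:
        name, remaining = queue.popleft()
        used = min(remaining, time_slice)
        clock += time_slice              # the whole slot passes, used or not
        idle_time += time_slice - used   # unused part of the slot is wasted
        remaining -= used
        if remaining > 0:
            queue.append((name, remaining))  # back of the line for another turn
        else:
            finish_times[name] = clock - (time_slice - used)
    return finish_times, idle_time

# Hypothetical workload: one long task, one very short task, one medium task.
finish, wasted = round_robin({"P1": 25, "P2": 4, "P3": 10}, time_slice=10)
print(finish)   # {'P2': 14, 'P3': 30, 'P1': 45}
print(wasted)   # 11
```

The short task P2 leaves 6 units of its slot unused, and the long task P1 has to wait through two full cycles before it finishes, no matter how important it is.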
The Solution
The engineers realized that they needed a method for the processor to complete the important jobs first, and know which jobs to keep for later: a notion of priority. The tasks should also be processed one by one, so that you finish the important tasks as soon as possible: a run queue.
The solution is to assign each process a particular priority, and keep moving the higher priority tasks to the top of your queue. This priority-queue system ushered in the age of modern multi-programming and is still used in commercial operating systems.
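The idea of a priority-ordered run queue can be sketched in a few lines of Python. This is not how the AGC stored its jobs; it is a modern illustration using the standard `heapq` module, and the `RunQueue` class, the job names and the priority numbers are all invented for the example.

```python
import heapq
import itertools

class RunQueue:
    """A run queue that always hands back the highest-priority job."""

    def __init__(self):
        self._heap = []
        self._order = itertools.count()  # tie-breaker: earlier jobs go first

    def add(self, priority, name):
        # heapq is a min-heap, so negate the priority to pop the highest first.
        heapq.heappush(self._heap, (-priority, next(self._order), name))

    def next_job(self):
        _, _, name = heapq.heappop(self._heap)
        return name

    def __len__(self):
        return len(self._heap)

# Hypothetical jobs, added in no particular order.
queue = RunQueue()
queue.add(10, "update display")
queue.add(30, "descent guidance")
queue.add(20, "navigation update")
queue.add(30, "engine throttle")

order = [queue.next_job() for _ in range(len(queue))]
print(order)
# ['descent guidance', 'engine throttle', 'navigation update', 'update display']
```

No matter what order the jobs arrive in, the important ones come out first, and jobs with equal priority keep their arrival order.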
Apollo had two separate job queues: the Waitlist and the Executive. Programs would move from the Waitlist to the Executive once the higher priority processes were executed.
When a higher priority process comes along, it is moved straight to the top of the Executive, in what is called an interrupt, as the executing process is interrupted to make way for the more important process. The AGC also had a small number of 12-word erasable areas called Core Sets, where it stored information about the programs that were executing. Hence, interrupted processes could be resumed where they left off. Clever!
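Here is a rough Python sketch of that interrupt-and-resume behaviour. It is a toy model, not the AGC's actual mechanism: the `run` function, the job names and the one-step-per-tick clock are assumptions made for illustration, and the `saved` dictionary merely stands in for the role a Core Set plays.

```python
import heapq

def run(arrivals):
    """Run jobs one step per tick, always picking the highest-priority ready job.

    arrivals maps a tick number to a list of (priority, name, total_steps).
    Returns a log of which step of which job ran on each tick.
    """
    ready = []    # heap of (-priority, arrival_order, name)
    saved = {}    # name -> [next_step, total_steps]; stands in for a Core Set
    log = []
    arrival_order = 0
    tick = 0
    last_arrival = max(arrivals, default=-1)
    while ready or tick <= last_arrival:
        for priority, name, total_steps in arrivals.get(tick, []):
            saved[name] = [1, total_steps]
            heapq.heappush(ready, (-priority, arrival_order, name))
            arrival_order += 1
        if ready:
            name = ready[0][2]               # peek at the most important job
            step, total = saved[name]
            log.append(f"{name} {step}/{total}")
            if step == total:
                heapq.heappop(ready)         # finished: release its saved state
                del saved[name]
            else:
                saved[name][0] = step + 1    # remember where to resume
        tick += 1
    return log

# Hypothetical scenario: a low-priority display job starts, then guidance arrives.
log = run({0: [(10, "display", 3)], 1: [(30, "guidance", 2)]})
print(log)
# ['display 1/3', 'guidance 1/2', 'guidance 2/2', 'display 2/3', 'display 3/3']
```

The display job is interrupted after its first step, guidance runs to completion, and the display job then picks up at step 2 rather than starting over.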
The Test
Priority-interrupt scheduling showed its true power during Apollo 11, after the Lunar Module initiated its descent towards the Moon. When the spacecraft was at 30,000 feet above the lunar surface, while the descent program was executing, the computer raised a warning alarm with the code 1202. In what has to be called the most important tech support call of all time, engineer Steve Bales at Mission Control in Houston discovered that the code meant “Executive Overflow, No Core Sets”, implying that the computer was handling more than it could take.
Think about how much work the system must have been doing at the time. It had to know where the lunar module was and where it was moving, information called the state vector. It needed to maintain the right attitude based on that position, as well as velocity, altitude, and engine performance data. It also needed to adjust the abort trajectory constantly, ready to get the crew back into orbit should something force an abort. All this consumed nearly 90% of the CPU’s computing time.
Houston identified that the overflow could be due to pilot Buzz Aldrin leaving the external radar on in passive mode (called SLEW mode), and that it might still be eating some of the computer’s precious clock cycles. Sure enough, once the radar was turned off, the alarms ceased and the computer was stable once again.
Without a priority-interrupt system, the system would have crashed when the program gave it a new job, and would need a hard reboot. Instead, the scheduler simply chucked out some lower priority tasks and sounded the alarm, keeping the system alive. Without the scheduler, the moon landing would never have happened!
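The behaviour described here, shedding the least important work and raising an alarm instead of falling over, can be sketched in Python as follows. This is a heavy simplification and an assumption on my part about how to model it; the real AGC's recovery logic was more involved. The `Executive` class, the job names and the priority numbers are hypothetical, and only the alarm code 1202 comes from the story above.

```python
class Executive:
    """Toy scheduler with a fixed number of job slots (core sets)."""

    def __init__(self, core_sets):
        self.core_sets = core_sets   # how many jobs can be tracked at once
        self.jobs = {}               # name -> priority
        self.alarms = []

    def schedule(self, name, priority):
        """Try to admit a job. Returns True if it was admitted."""
        if len(self.jobs) >= self.core_sets:
            self.alarms.append(1202)          # overflow: no free core sets
            lowest = min(self.jobs, key=self.jobs.get)
            if self.jobs[lowest] >= priority:
                return False                  # newcomer is least important: drop it
            del self.jobs[lowest]             # shed the least important job
        self.jobs[name] = priority
        return True

executive = Executive(core_sets=3)
executive.schedule("guidance", 30)
executive.schedule("throttle", 25)
executive.schedule("display", 10)

radar_ok = executive.schedule("radar counter", 5)     # rejected, alarm raised
abort_ok = executive.schedule("abort planning", 28)   # admitted, display is shed

print(radar_ok, abort_ok)        # False True
print(sorted(executive.jobs))    # ['abort planning', 'guidance', 'throttle']
print(executive.alarms)          # [1202, 1202]
```

The point of the design is that an overload costs you the least important work and an alarm, while the critical jobs keep running.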
The Greatest Computer Ever Made
I only talked about one small part of the AGC here, but it was truly a revolutionary machine in every way. As a programmer whose programs regularly leak memory, it baffles me how the software was *that* well optimized, to be able to do so much with so little. This wasn’t just a computer that had to work pretty well, it was a computer that would kill people if it failed. Perfection requires effort, enormous amounts of effort. The work was so demanding, 15 MIT engineers got divorced while working on the project! The years of passionate work building the AGC speak through the stellar success story of the Apollo program.
I think it’s safe to say that this is indeed the greatest computer ever made, and will be for ages to come.
Notes
I had a lot of fun writing this article, and I learned a lot as well!
NASA has some great material explaining the architecture of the AGC, and the code has been painstakingly digitized and is available on GitHub.
If you’d like to read more, I wrote an article about Apollo’s guidance instrumentation over on the Nakshatra blog. Do check it out!