Knuth: It’s not only fun. The job of a mathematician is to make proofs but almost never, when you’re solving a mathematical problem, do you find a theorem for which the hypotheses are exactly what you need for the problem you’re solving. Almost always you’ve got something that’s sort of like the theorem that’s in the book. So what you do is you look at the proof of that theorem and you say, “Oh, here’s how I have to change that proof in order to prove the hypothesis that I really have.” So mathematical books are packed with theorems, but you never plug in exactly the theorem—you want to see that proof because it’s one time in a hundred when you’ll find just the theorem that you wanted. I think it’s exactly the same with software.
Seibeclass="underline" Yet isn’t software—I think you’ve said it yourself—about the most complex thing human beings have ever made?
Knuth: It was Dijkstra, I think, first, but yeah. It’s because the complexity of putting something together is so nonuniform. Pure mathematics tends to have a few rules that are applied universally—three or four axioms that describe a system. Computer programs have many parts—step one is different from step two is different from step three. It brings together all these things and they have to fit in an intricate way.
Seibeclass="underline" So given that complexity, it seems like at some point you have to take some black boxes and say, “OK, we know how this thing works, and we can use it.” If we were forced to look inside every black box, we’d never finish.
Knuth: I’m not saying black boxes are useless. I’m saying that if I’m not allowed to open ’em up—if I have to do everything with a library or something like this, I would come up with much, much worse results and much slower.
Seibeclass="underline" Slower-running or slower-developing?
Knuth: Both. Well, OK, I can get programs to work in a hurry, so I can’t claim that. It’s just taking me longer because I have to search through more reference manuals and find the right parts, so it’s more of a search problem than a creative problem.
Seibeclass="underline" A while back the guy who wrote the standard Java collection libraries wrote an article about how there had been a bug in their implementation of binary search for nine years. Basically they took the min and the max and added them together and divided by two. But, of course, if that addition overflows, that’s a bug. So, it’s bad the standard library had a bug in it, but they found it eventually and fixed it. If everyone wrote binary search themselves, the percentage of binary searches that would be wrong would probably be quite high.
Knuth: That’s true. And that’s just one of a huge number of examples. Binary search is a particular example that we started out our programming classes in the ’70s. The first day of class everyone writes a program for binary search and we collect them and the TAs take a look. And you find that fewer than ten percent are correct. And there are four or six different bugs. But not with respect to the overflow that you mentioned, which is a new one—it didn’t occur in my classes that we thought adding these two numbers might be a problem.
But this black-box idea—why do I hate it so much? We’re talking arithmetic, so say you have a program for matrix multiplication. You have a matrix multiply black box and then you change the data type from real to complex and so then you’ve got a complex matrix multiply box and it takes 4xn3 steps instead of n3 steps. If the real case takes a certain time, t, then the complex case takes time 4t. But if you’re allowed to open the box, then you’ll only need three real matrix multiplications instead of four because there’s an identity that will describe the product of two complex matrices. That’s just one small example—it just goes on and on.
I’ll have a priority queue or I’ll have some kind of a heap structure; whatever it is—binary search—I’ll have a good source for the algorithm but it won’t quite fit what I want, so I’ll adapt it every time. And I think I’ll be better off. I know I’m going against a lot of people who think their job is to write things that everybody is going to use, so if there are any bugs in it, they get to fix ’em and everybody else’s program will start working better now. OK. I’m unhappy with that kind of world. I like to see their program.
When I wrote Volume I of The Art of Computer Programming people didn’t realize that they could use linked lists in their own programs, that they could use pointers for data structures.
If you had a problem that had something beyond arrays, you would go to somebody’s package, or an interpreted language like IPL-V or Lisp. There were also Fortran versions and you could get these subroutines and then you would learn how to use that package and you would do your program in that. It was completely preposterous for somebody to teach an ordinary programmer how to include linked lists in their program. Everything was supposed to be done by these canned routines.
But general packages have got to have all of this extra machinery in order to handle cases that only come up for a small number of the users. So my book sort of opened people’s eyes: “Oh my gosh, I can understand this and adapt it so I can have elements that are in two lists at once. I can change the data structure.” It became something that could be mainstream instead of just enclosed in these packages.
Well, I’m seeing the same thing now that I’m writing about BDD structures. At the moment there are three or four packages of subroutines that work with BDDs but the thrust of what I’m writing now for The Art of Computer Programming is that you too can program simple versions of BDDs for lots of applications and it’ll go very far. You can use them for lots of different kinds of problems where you don’t need all the bells and whistles of these other packages and these things you can do are easy to understand and easy to put into programs you’re writing yourself.
Earlier this year I finished my section on bitwise tricks and techniques and these are things that have been a black art in the hacker community for years. I decided it’s time to say that there’s a theory of these things that you can understand the ideas, how they fit together. You can use them yourself with confidence. And you can build on things and do amazing things that last year you didn’t know how to do well at all. It went through the underground until now; it’s something that we might as well teach to people—something that deserves to be common knowledge.
I write a lot of programs and I can’t claim to be typical but I can claim that I get a lot of them working for a large variety of things and I would find it harder if I had to spend all my time learning how to use somebody else’s routines. It’s much easier for me to learn a few basic concepts and then reuse code by text-editing the code that previously worked.
Seibeclass="underline" How has the way you think about programming changed from those earliest days to now?
Knuth: We already talked about literate programming—that’s a radical departure, that I’m viewing myself as an expositor rather than trying to just put together the right instructions. Dijkstra came out with that same evolution. In the end his programs were even more literate than mine in the sense that they didn’t even go into the machine. They were only literate.
And he was one of the people largely responsible for structured programming, where we saw patterns that we could use to scale up our program so that we could make larger programs and still keep our head straight. You write a program that’s ten times as big but you don’t have to lose ten times as much sleep over it because you have tools that allow you to put things together reliably in a larger system. That was definitely different.
So that’s one important aspect, the idea of the abstractions that we have to understand, the abstractions that allow us to deal with large systems and still be pretty confident that we’re in control and we know what we’re doing, even though they’re a mind-bogglingly complex things.