n5321 | 2026-02-27 12:13

Tags: AI



Teaser / Intro

Leslie Lamport: If you think you know something but don't write it down, you only think you know it.

Host: This is Leslie Lamport. He's a Turing Award winner famous for his contributions to distributed systems. And I interviewed him for the stories behind his papers.

Leslie Lamport: Their reaction shocked me. They became angry. I really thought they might physically attack me.

Host: What was it about Dijkstra's old solution that you felt was unsatisfactory?

Leslie Lamport: It was not an obvious idea to most people, and that had actually impressed Dijkstra.

Host: As the inventor of the Paxos algorithm, I asked him his thoughts on the competing Raft algorithm.

Leslie Lamport: There was a bug discovered in Raft and fixed, but I believe the algorithm that they found more understandable was one with that bug.

Host: I also enjoyed reflecting on his 50-year career. I said things like, "You never considered yourself smart. How could that be?"

Leslie Lamport: Stupid people think they're smart because they're too stupid to realize they're not.

Host: You felt like a failure at some point because you wanted to develop this grand theory of concurrency and you never discovered it. Do you still feel that way?


The Bakery Algorithm & Dijkstra

Host: Here's the full episode. I wanted to start with the bakery algorithm. What is the problem that the bakery algorithm solves, and how did you discover the problem?

Leslie Lamport: Well, the problem was invented, or discovered, by Edsger Dijkstra in a paper from 1965, I think, and I consider that really the beginning of the theory of concurrent programming. He was the first one who really made use of the idea of concurrency as a way of structuring programs as a collection of semi-independent tasks, where the processes have to synchronize with one another.

This was right at the beginning of time sharing and the idea of multiple people using the same computer. People realized that computers worked faster than humans, and computers were very expensive in those days, so they wanted a computer to be used simultaneously by multiple people. The program each user was running was a separate program, but sometimes there were resources that got shared, for example a printer. If two people tried to print on the same printer at the same time, the result would be not very satisfactory.

So he realized there was this problem of synchronizing multiple processes via what he called a critical section: some piece of code in each of the processes such that at most one process can be executing that piece of code at any particular time. That code might be the code that prints something on the printer. So the problem was how to get the processes to synchronize among themselves so that at most one process was executing its critical section at a time.

It was in 1972 that I learned about the problem, because there was an article giving a solution to it in the CACM, the Communications of the ACM. I used to program, and I liked little programming problems, and this was a very nice little programming problem. I looked at the solution, which is fairly complicated, and I said, "Oh gee, that shouldn't be so hard." So I whipped up a very simple algorithm for two processes and submitted it to CACM. A couple of weeks later, I received a letter from the editor pointing out the bug in my program. That had two effects.

The first was that I realized that concurrent programs were hard to get right and that you needed a proof that they were correct. The second was that it made me feel: I'm going to solve that damn problem. I came up with the bakery algorithm, whose idea came from what's now called the deli problem: you have a deli counter with a roll of tickets, every customer who comes in takes a ticket, and the next person to be served is the one with the lowest-numbered ticket that hasn't been served yet.

I took that idea, but since the problem as specified by Dijkstra involved no central control, there was no central server handing out tickets; each process basically had to choose its own ticket. That was the basic idea, and the algorithm was quite simple. And I wrote a proof of correctness.

The proof of correctness revealed to me that this algorithm had a very interesting property. There was a general feeling, in fact somebody had published a statement in a book or paper, that it was impossible to implement mutual exclusion like this without using some lower-level mutual exclusion. The mutual exclusion that was generally assumed was that of shared registers: shared pieces of memory that could be written and read by different processes. The idea is that you couldn't have two processes writing at the same time, or one process reading while another process was writing. People assumed that those actions were atomic: they always behaved as if they occurred in some specific order.

But the amazing thing about the bakery algorithm was that it didn't require that assumption. Each piece of shared memory was only written by a single process, so the algorithm didn't have to worry about two processes' writes interfering with each other. The only problem is that somebody reading a value while it was being written might get some unknown value, but the algorithm worked anyway. If one process read while the register was being written, the reading process could get absolutely any value, and the algorithm still worked.
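The algorithm Lamport describes can be sketched in a few lines. Below is a minimal Python rendering (the names and the toy counter workload are mine, not from the interview). Note that CPython's global interpreter lock makes each list access atomic, so this demo does not exercise the algorithm's tolerance of non-atomic reads; it only shows the structure: take a ticket one higher than any ticket you can see, then wait your turn.

```python
import threading
import time

N = 3                      # number of processes (threads)
choosing = [False] * N     # True while process i is picking its ticket
number = [0] * N           # ticket numbers; 0 means "not competing"
counter = 0                # shared resource the lock protects

def lock(i):
    # Doorway: take a ticket one higher than any ticket seen.
    choosing[i] = True
    number[i] = 1 + max(number)
    choosing[i] = False
    for j in range(N):
        if j == i:
            continue
        # Wait until j has finished choosing its ticket.
        while choosing[j]:
            time.sleep(0)  # busy-wait; yield so other threads can run
        # Wait while j holds a smaller ticket (ties broken by process id).
        while number[j] != 0 and (number[j], j) < (number[i], i):
            time.sleep(0)

def unlock(i):
    number[i] = 0

def worker(i):
    global counter
    for _ in range(200):
        lock(i)
        tmp = counter       # critical section: a read-modify-write
        counter = tmp + 1   # that would lose updates without the lock
        unlock(i)

threads = [threading.Thread(target=worker, args=(i,)) for i in range(N)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 600
```

Each shared variable (`choosing[i]`, `number[i]`) is written only by process `i`, which is exactly the property discussed above: no write-write conflicts are ever needed.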

Host: I saw in your writing about this problem that you shared it with a colleague named Anatol Holt, and the proof was so remarkable that he didn't believe it and...

Leslie Lamport: Well, the result was so remarkable.

Host: Yes, yes, he didn't believe it.

Leslie Lamport: I wrote the proof on the whiteboard for him. He couldn't find anything wrong with it, but he went home saying there must be something wrong with it, and he obviously never found anything wrong with it.

Host: Right. I saw that the name of the paper is "A New Solution of Dijkstra's Concurrent Programming Problem." What was it about Dijkstra's old solution that you felt was unsatisfactory and made you want to solve this problem?

Leslie Lamport: Well, there was an unsatisfactory aspect of his original solution: it had the property that if processes kept trying to enter their critical section, an individual process might be starved and never get access to the critical section. That was solved by the next solution, which I think was Don Knuth's. What was measured, what was considered the efficiency of a solution, was how long a process might have to wait. I believe the bakery algorithm was the first one that was really first-come-first-served. That is, if one process chose its number before another process tried to enter, the first process would enter the critical section before the other process did. I believe the bakery algorithm was the first one with that property. And I also think it was simpler than the other solutions.

Working with Dijkstra & The Gift of Abstraction

Host: In a lot of the writing I see that you worked with Dijkstra, and I saw that in 1976 you actually spent a month in the Netherlands and worked with him. Can you talk about that a little bit?

Leslie Lamport: Dijkstra used to write these things called EWDs, after his initials: little papers where, when he thought of something, had some idea, he would write it down and send it out to people. One of those EWDs was about an algorithm that he and some associates, actually sort of mentees I guess you would call them, had written. It was the first concurrent garbage collection algorithm.

A way of writing programs had evolved where there was a pool of memory, and when a program needed a piece of memory, it would ask some server for it and be given that piece of memory. At some point it would stop using that memory, but the program itself wouldn't know; the particular process that had been given the memory wouldn't know whether some other process was using it or not. So there was an additional process, called the garbage collector, which would go around examining the memory, decide which pieces were no longer being used, and put them back on what's called the free list, from which the server process that was handing out memory could take it.

I looked at it and realized that I could simplify the algorithm. The handling of the free list was done by a special process, which had to worry about its own coordination with the processes using the memory. I realized that the free list could just be made part of the regular data structure, so it didn't need special handling. That seemed to me like a very simple and very obvious idea, and I sent it to him. When I got the next version of the paper, I discovered he had made me an author, and I thought that was very generous of him, because it seemed like such a simple, obvious idea. Much later I realized that it was not an obvious idea to most people, and that it had actually impressed Dijkstra.

That was the only thing I actually did with Dijkstra. Many years later he said that I had a remarkable ability at abstraction. Only in very recent years, maybe after I got the Turing Award, did I realize that the reason for my success, the reason I wound up getting a Turing Award, was not that I was particularly smart but that I had this gift of abstraction, and Dijkstra was smart enough to realize it. I was invited to spend a month, though not with Dijkstra but with a colleague of his, Carel Scholten. Only one thing that was ever published came out of that. Carel and I would meet with Dijkstra once a week, and in the course of those discussions an idea somehow came up that led to a variant of the bakery algorithm, which I wrote up and published. So that was the one tangible result that came from my month in the Netherlands.

Host: Yeah. I saw that you wrote that you spent one afternoon a week working, talking, and drinking beer at Dijkstra's house, and you don't remember exactly who was in charge of what on that paper, but...

Leslie Lamport: Yeah. Well, I don't think I really could have gotten that drunk, because I probably drove to the meeting and back from the meeting.

Host: Right. Right.

Leslie Lamport: The Dutch beer that I was drinking was not very alcoholic.

Time, Clocks, and the Ordering of Events

Host: I wanted to talk about your most cited paper, the one titled "Time, Clocks, and the Ordering of Events in a Distributed System." What's the story behind the paper and the problem you were solving with it?

Leslie Lamport: The origin was simple. Somebody sent me a paper on building distributed databases, where you have multiple copies of the data in different places and you need to keep them synchronized in some way. I looked at it and realized their solution had the property that things would be executed as if they occurred in some sequence, but that sequence could be different from the sequence in which they actually happened.

The notion of what happening before means is not obvious, or not obvious to most people. But I happened to have learned about special relativity, in particular what's known as the space-time view of special relativity, where you consider space and time together as just one four-dimensional thing. Einstein wrote his paper in 1905, and in, I think it was 1909, somebody whose name I'm blocking on provided this four-dimensional view. That view has a particular notion of what it means for one event to occur before another: one event happens before another if a signal emitted at the first event was received by whoever performed the second event before that second event happened. And the communication could not travel faster than the speed of light, because nothing can travel faster than the speed of light.

Well, I realized there was an obvious analogy. The notion of happens-before is exactly the same as in relativity, except that instead of whether one event can influence another by things traveling at the speed of light, it's whether the first event could have affected the other by information sent over messages that were actually sent in the system. The thing that blew people away was this definition of happens-before in a distributed system. This was also the first paper, I would say, that had a scientific result about distributed systems.

I made, perhaps, the mistake that I was warned against at some point of having two ideas in one paper. The other thing I realized was that there was an algorithm that would produce an ordering satisfying the condition that if one event happened before the other, the first event would be ordered before the other. And I realized that if you had an algorithm to do that, you could use it to provide the synchronization you needed for any distributed system, because you could describe that system in terms of a state machine.
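The logical-clock algorithm from the paper is small enough to sketch directly. This is a minimal rendering with hypothetical method names of my own; the guarantee it provides is exactly the one described above: if event a happened before event b, then a's timestamp is smaller than b's (the converse need not hold).

```python
class LamportClock:
    """Logical clock: if event a happened before event b,
    then timestamp(a) < timestamp(b)."""

    def __init__(self):
        self.time = 0

    def local_event(self):
        # Any internal event advances the clock.
        self.time += 1
        return self.time

    def send_event(self):
        # Timestamp to attach to an outgoing message.
        self.time += 1
        return self.time

    def receive_event(self, msg_time):
        # Advance past both our own history and the sender's.
        self.time = max(self.time, msg_time) + 1
        return self.time

# Two processes; P1 does some local work, then sends a message to P2.
p1, p2 = LamportClock(), LamportClock()
p1.local_event()                    # P1's clock: 1
t_send = p1.send_event()            # P1's clock: 2, attached to the message
t_recv = p2.receive_event(t_send)   # P2's clock: max(0, 2) + 1 = 3
print(t_send, t_recv)  # 2 3
```

Ties between causally unrelated events can then be broken by process id, yielding the total order the paper uses.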

A state machine, as I described it then, is something that has a state and executes commands that need to be executed in order. A command is simply something that makes a change to the state and produces a value. So you can describe a state machine just by saying how commands affect the state: what the new state is as a function of the original state, and what the output value is as a function of the original state.
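That picture, a command as a function from state to new state plus output, can be made concrete. Here is a toy sketch of my own (a tiny key-value store), showing why a total ordering of commands is all you need: replicas that apply the same commands in the same order end in the same state.

```python
# A state machine: (state, command) -> (new_state, output).
def apply_command(state, command):
    op, *args = command
    if op == "put":
        key, value = args
        return {**state, key: value}, None
    if op == "get":
        (key,) = args
        return state, state.get(key)
    raise ValueError(op)

def replay(log):
    """Apply a totally ordered log of commands from the empty state."""
    state, outputs = {}, []
    for cmd in log:
        state, out = apply_command(state, cmd)
        outputs.append(out)
    return state, outputs

log = [("put", "x", 1), ("put", "y", 2), ("get", "x")]

# Two replicas applying the same log agree on everything.
s1, o1 = replay(log)
s2, o2 = replay(log)
print(s1 == s2, o1[-1])  # True 1
```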

This was very obvious to me, but it's really, in practice, the important idea in that paper, because it showed this method of building distributed systems by thinking in terms of state machines, and of thinking about concurrent systems in terms of state machines. But that part was completely ignored. As a matter of fact, twice I talked to people about that paper and they said there was nothing in it about state machines, and I had to go back and reread the paper to be sure I wasn't going crazy. It really did talk about state machines.

It's important for another reason. Concurrent programs, the bakery algorithm is really an exception, are written assuming atomic actions, so you can assume that the execution proceeds as a sequence of events. It turns out that the way to understand why a program produces the right answer is this: well, you give it the input and it produces the right answer, but by the time you're in the middle of the execution, what it was given at the beginning is ancient history. The only thing that tells the program what to do next is its current state.

And the way to understand a program, a simple program that just takes input and produces an answer, is to say what property of the state at each point ensures that the answer it produces is going to be correct. That property, which is mathematically a boolean-valued function of the state, is called an invariant. Understanding the invariant is the way to understand the program. I realized that the same thing is true of concurrent systems and concurrent programs. People like to write behavioral proofs, reasoning about sequences.

The problem with that is that the number of possible sequences is exponential in the length of the sequence, so your reasoning gets to be very complicated and it's very easy to miss cases. The number of possible executions is exponential in the number of processes, but the complexity of an invariance proof is basically quadratic in the number of processes. That's basically why invariance proofs are better. For a long time, people doing distributed systems theory kept trying to develop methods and formalisms based on partial orderings, and they published a lot of papers, but if you want to do it in practice, that's not the way to do it. I shouldn't say it's never the way: there are algorithms like the bakery algorithm where thinking in partial orderings is in fact a very good way of doing it. But those are the exceptions. The method that you can be sure will work is the use of invariants.
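The contrast can be made concrete with a toy explicit-state check: instead of reasoning about every interleaving, enumerate the reachable states of a small model and check the invariant in each one (roughly what a model checker does). The two-process model and all names below are my own illustration, not from the interview.

```python
from collections import deque

# Toy model: two processes synchronizing with a test-and-set lock.
# A state is (pc0, pc1, lock), where each pc is "idle" or "cs".
def successors(state):
    pcs, lock = list(state[:2]), state[2]
    for i in range(2):
        if pcs[i] == "idle" and not lock:
            # Atomically acquire the lock and enter the critical section.
            nxt = pcs.copy(); nxt[i] = "cs"
            yield (nxt[0], nxt[1], True)
        elif pcs[i] == "cs":
            # Leave the critical section and release the lock.
            nxt = pcs.copy(); nxt[i] = "idle"
            yield (nxt[0], nxt[1], False)

def invariant(state):
    # Mutual exclusion: both processes are never in the critical section.
    return not (state[0] == "cs" and state[1] == "cs")

# Breadth-first search over reachable states, checking the invariant
# in every state rather than examining every interleaving.
init = ("idle", "idle", False)
seen, queue = {init}, deque([init])
while queue:
    s = queue.popleft()
    assert invariant(s), f"invariant violated in {s}"
    for t in successors(s):
        if t not in seen:
            seen.add(t)
            queue.append(t)
print(len(seen))  # 3 reachable states, all satisfying the invariant
```

The state space stays small even though the number of interleavings of the two processes' steps grows without bound, which is the quadratic-versus-exponential point above in miniature.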

The Byzantine Generals Problem

Host: I want to talk about, I guess, the next paper, which is the Byzantine Generals Problem. I think that's something we hear about and learn about going through computer science in college, and the name is great. I want to know the story behind that problem.

Leslie Lamport: After I wrote that time-clocks paper, which tells you how to build a distributed system assuming no failures, it was obvious that one reason for distributed systems is that you have multiple computers, so if one fails you can keep going. In particular, that was the problem being solved at SRI when I joined it. Before I got to SRI, I had started working on that problem, and there was no notion of what a failure should be assumed to do, so I assumed the worst possible case: that a failed process might do absolutely anything.

And I came up with an algorithm that would implement a state machine under that assumption. The algorithm I came up with used digital signatures: it used the fact that a faulty process might do anything, but it could not forge the signature of another process.

Host: which just means that the message can be trusted that it came from a private...

Leslie Lamport: Right, so you can relay messages, and people can check that the relayed message is actually the one that was originally sent. So I had a solution using that. When I got to SRI, I realized that the people there were trying to solve the same problem, but there were two differences. First of all, at the time I did this, which was around 1975, very few people knew about digital signatures. I don't remember when the Diffie–Hellman paper was published, but it was around 1975. I happened to know about digital signatures because Whit Diffie, one of the two authors of that paper, was a friend of mine. In fact, at one point we were at a coffee house and he was describing these things, and he said, we have this problem of building digital signatures that we haven't solved. I said, oh, that seems easy enough, and I sat down and, literally on a napkin, wrote out the first digital signature algorithm.

It was not practical at the time, because it required something like 128 bits to sign one bit of the thing you're signing. It's not quite as bad as you might think, because you could sign not the entire document but a hash of that document, which you assume people cannot forge.

Host: The hash, they can't reverse it.

Leslie Lamport: Yeah, you can't reverse it: take a hash and find some other document that satisfies that hash. But anyway, that's why digital signatures were part of my toolkit. The people at SRI didn't have that, but they also had a nicer abstraction. Instead of getting agreement among the processes on a sequence of commands, they had an algorithm for agreement on a single command, and that algorithm would be executed multiple times. That was a nicer way of describing what you're doing than my method.
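The napkin scheme, now known as a Lamport one-time signature, can be sketched in a few lines: hash the message, and for each bit of the hash reveal one of two secret preimages. This is my own modern sketch using SHA-256 (the interview's "128 bits" figure refers to the original parameters), and a key pair must sign only one message.

```python
import hashlib
import secrets

H = lambda data: hashlib.sha256(data).digest()
BITS = 256  # we sign the SHA-256 hash of the message

def keygen():
    # Secret key: two random preimages for each bit of the hash.
    sk = [(secrets.token_bytes(32), secrets.token_bytes(32))
          for _ in range(BITS)]
    # Public key: the hashes of those preimages.
    pk = [(H(zero), H(one)) for zero, one in sk]
    return sk, pk

def msg_bits(message):
    digest = H(message)
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(BITS)]

def sign(message, sk):
    # Reveal one preimage per bit of the message hash.
    return [sk[i][b] for i, b in enumerate(msg_bits(message))]

def verify(message, sig, pk):
    # Each revealed preimage must hash to the matching public value.
    return all(H(sig[i]) == pk[i][b]
               for i, b in enumerate(msg_bits(message)))

sk, pk = keygen()
sig = sign(b"attack at dawn", sk)
print(verify(b"attack at dawn", sig, pk))   # True
print(verify(b"retreat", sig, pk))          # False
```

A forger would have to invert the hash to produce the unrevealed preimages, which is exactly the "can't reverse the hash" property from the exchange above.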

So the first paper that was published gave both algorithms. Since they didn't have digital signatures, they used a different algorithm, which had the property that to tolerate one faulty process you needed four processes, whereas if you used digital signatures you only needed three. The original paper contained both algorithms, and so I was one of the authors. The algorithm without digital signatures is more complicated, and the general one for an arbitrary number of processes was really a work of genius. It was almost incomprehensible; you just had to read this complicated proof that, for an arbitrary number of faults, to tolerate n faults you needed 3n+1 processes without digital signatures, whereas with digital signatures you could do with fewer. The algorithm for a single fault wasn't hard, but the one for multiple faults was. Marshall Pease was the one who did it, and it was just brilliant. Later, in a later paper, I discovered a simpler proof, an inductive one: if it works for n-1 faults, it works for n. The original was just brilliant; who knows how he discovered it. Anyway, we published that paper, and I realized that this was the whole idea of Byzantine faults.

So the thing is that a Byzantine fault is one where you assume a process can do anything. Now, I was assuming that a process can do anything because I didn't know what else to assume, but the people at SRI had a contract for building a multi-computer system for flying airplanes, and so they were the ones who appreciated the need for handling processes that can do malicious things, because they really couldn't assume what a faulty process would do. And every time you tried to get an algorithm with three processes for one fault, you'd find something that seemed to work, with failure scenarios that surely couldn't happen in practice. And then you'd be able to find some sequence of plausible failures that would lead the algorithm to be defeated if there were a faulty process.

So you needed four. And for some reason, the use of digital signatures in my algorithm was almost a metaphor: since we weren't worried about malicious failures, just things that happen randomly, it should be possible to write a digital signature algorithm with a sufficiently low probability of failing. But I never worked on that, and nobody else ever did. So that algorithm was pretty much ignored, because digital signatures were very expensive in those days. I don't know what's being done now, because digital signatures are just computing and computing is cheap. But I remember at some point I happened to be communicating with an engineer at Boeing, and I asked whether they knew about those results. He said yes; he in fact was the one at Boeing who had read that paper, and his reaction was: oh, we need four computers.

But at any rate, I realized that this was an important result and it should be well known, and I had learned one thing from Dijkstra. He wrote this paper called the dining philosophers problem, and that paper got a lot of attention. But the basic problem, I won't go into what it is, I think was not particularly interesting; it had a cute story to it. It involved a bunch of philosophers sitting around a table with some funny kind of spaghetti that required two forks, and there was one fork between each pair of neighbors, so each fork would be shared by two people. I realized it was because of that cute story that the problem was popular.

So I decided that our work needed a cute story, a nice story, and I invented the Byzantine generals, the idea being that, for the one-failure case, you have four generals who have to agree whether or not to attack. If they all attack, they'll win the battle; even if only three of them attack, they'll win; but if only two attack, they would lose. But one of the generals might be a traitor. So how could you solve this problem? It's phrased in terms of these generals having to communicate and make the single decision whether to attack or retreat. And I called it the Byzantine Generals Problem.
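The four-general, one-traitor case can be simulated. The sketch below follows the oral-messages idea from the paper (the commander sends its order, the lieutenants relay what they received, and each loyal lieutenant takes a majority vote); the traitor's specific lies are my own toy choices.

```python
from collections import Counter

# Oral-messages algorithm for 4 generals tolerating 1 traitor.
# General 0 is the commander; 1, 2, 3 are lieutenants.
def om1(commander_value, traitors):
    # Round 1: the commander sends its value to every lieutenant.
    # A traitorous commander may send different values to each.
    if 0 in traitors:
        received = {1: "attack", 2: "retreat", 3: "attack"}
    else:
        received = {i: commander_value for i in (1, 2, 3)}

    # Round 2: each lieutenant relays what it received to the others.
    # A traitorous lieutenant lies about what it received.
    relayed = {}  # relayed[(i, j)] = what lieutenant j told lieutenant i
    for j in (1, 2, 3):
        lie = "retreat" if received[j] == "attack" else "attack"
        for i in (1, 2, 3):
            if i != j:
                relayed[(i, j)] = lie if j in traitors else received[j]

    # Each loyal lieutenant decides by majority over the three reports.
    decisions = {}
    for i in (1, 2, 3):
        if i in traitors:
            continue
        votes = [received[i]] + [relayed[(i, j)]
                                 for j in (1, 2, 3) if j != i]
        decisions[i] = Counter(votes).most_common(1)[0][0]
    return decisions

# A traitorous lieutenant cannot break agreement among the loyal ones:
print(om1("attack", traitors={3}))  # {1: 'attack', 2: 'attack'}
# A traitorous commander: the loyal lieutenants still agree with each other:
print(om1(None, traitors={0}))
```

With only three generals the majority vote can be defeated, which is the "you need four" result the interview keeps returning to.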

Host: I saw in your notes about the problem that there was maybe a prior version that was called the Chinese generals problem or something like that.

Leslie Lamport: Oh yeah. There was a different problem that Jim Gray described, basically an impossibility result, called the Chinese generals problem, and I won't bother going into what it is. That gave me the idea of generals. I actually initially thought of Albanian generals, because at that time Albania was a black hole as far as the rest of the world was concerned. It was a communist regime, part of the Soviet bloc, but even more Soviet than the Soviet Union, and more restrictive. My boss said, well, there are Albanians in the world, so shouldn't it have a different name? And then I realized there aren't any Byzantines around, and that was the perfect name.

Host: It's interesting to me in this story that this isn't the first time the problem was specified, but it was the first time that you named it, gave it a good, catchy name, and added some additional results. What was it that you saw in that problem that made it interesting? Or rather, how do you know that a problem is worth putting extra time into?

Leslie Lamport: Oh, well, with this one it was because it was obvious that computers were going to fly our airplanes. The reason, in fact, was that this was during the oil crisis in the 70s, and people knew they could build more energy-efficient planes by reducing the size of the control surfaces. But that made the plane aerodynamically unstable. A pilot couldn't make all the adjustments needed to keep it flying, but a computer could. So it was clear airplanes were going to be flown by computers, as they are today. And people thought that if you want to tolerate one fault, you just use three computers; they didn't realize that with arbitrary faults you need four. So that was a really important result, and that's why I believed it needed to be well known.

Problem Solving & Paxos

Host: Generally, when you look at the problems that you solved with your work, how did you decide? Because if you're working at a company, you can decide based on, I guess, the impact to the company, like whether it's going to make more money or save cost or something like that. But I wonder, in your work across your career, thinking about the bakery problem or some of your later work as well: it's so open-ended. How do you know which problems are the ones worthwhile?

Leslie Lamport: Throughout my career, I worked for private companies, not in academia or for the government. So some problems arose because an engineer would have a problem and come to me. Disk Paxos, for example, was a case where somebody actually wanted an algorithm to do what it did.

Host: You mentioned Paxos earlier, and I know that's one of your most famous works. I'm curious about the story behind that paper and the problem you were solving.

Leslie Lamport: Well, the problem is exactly the same problem I was solving in the Byzantine generals work: basically, building a fault-tolerant state machine. But by that time, the faults that interested industry were ones where failure meant that the computer just stopped, not that it did arbitrary things. Paxos is an algorithm for building fault-tolerant systems that handle that class of faults.

The people I was working with were at DEC SRC, which I joined in 1985, and they had built one of the first distributed operating systems. These were basically the people who had come from Xerox PARC and had invented personal computing, but they also had the notion of distributed personal computing, and they invented the Ethernet for that. All of the computers in the building were on a single Ethernet network and shared common storage, and they had a way of maintaining consistency of that storage. Well, they didn't have an algorithm; they had an operating system with code that did that. And I didn't believe that what they did was possible. I forget exactly why I didn't think it was possible, but at any rate I started trying to come up with an impossibility proof: an algorithm to solve this would have to do this, and in order to do this it would have to do that. And at some point I stopped and said: oh, this isn't an impossibility proof. This is an algorithm that does it.
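The kernel of that algorithm, single-decree Paxos, can be sketched synchronously. In the toy version below, message passing is replaced by direct calls and there are no failures or retries, so it only illustrates the two phases and the safety rule; the class and function names are my own, not the paper's presentation.

```python
# Single-decree Paxos, toy synchronous version: a value is chosen
# once a majority of acceptors accepts a proposal.
class Acceptor:
    def __init__(self):
        self.promised = 0      # highest proposal number promised
        self.accepted = None   # (number, value) last accepted, if any

    def prepare(self, n):
        # Phase 1b: promise to ignore proposals numbered below n,
        # and report anything already accepted.
        if n > self.promised:
            self.promised = n
            return True, self.accepted
        return False, None

    def accept(self, n, value):
        # Phase 2b: accept unless a higher-numbered promise was made.
        if n >= self.promised:
            self.promised = n
            self.accepted = (n, value)
            return True
        return False

def propose(acceptors, n, value):
    # Phase 1a: ask everyone to promise; need a majority.
    replies = [a.prepare(n) for a in acceptors]
    promises = [acc for ok, acc in replies if ok]
    if len(promises) <= len(acceptors) // 2:
        return None
    # Safety rule: adopt the highest-numbered value already accepted.
    prior = [acc for acc in promises if acc is not None]
    if prior:
        value = max(prior)[1]
    # Phase 2a: ask everyone to accept (n, value); need a majority.
    oks = sum(a.accept(n, value) for a in acceptors)
    return value if oks > len(acceptors) // 2 else None

acceptors = [Acceptor() for _ in range(3)]
v1 = propose(acceptors, n=1, value="A")   # "A" is chosen
v2 = propose(acceptors, n=2, value="B")   # must rediscover "A", not "B"
print(v1, v2)  # A A
```

The second proposer's failure to install "B" is the whole point: once a value is chosen by a majority, every later proposal is forced to carry the same value, which is what makes the replicated state machine consistent.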

Host: You said that they had code but not an algorithm.

Leslie Lamport: Yeah.

Host: Um what do you mean by that?

Leslie Lamport: When most people sit down and start writing programs, they start by thinking in terms of code. One of the things I learned fairly early in my career, I don't remember exactly when: back in the days when I started writing algorithms, people were calling them programs, and I was probably calling them programs too. At some point I realized that I wasn't talking about programs; I was interested in algorithms.

An algorithm is something more abstract than a program. A program is written in some particular code, but an algorithm can be implemented by programs written in any kind of code. It's something at a higher level of abstraction. And of course I liked that, because abstraction is something I was good at, even without realizing that that's what I was doing.

What I've spent a large part of my career on, from maybe about 2000 onward, was getting people who build concurrent systems not just to write code but to have an algorithm. Now, a system does lots of things, but there should be some kernel of the program that's involved with synchronizing the different processes — in a distributed system, the different computers — and that code is very hard to get correct. You don't want to think in terms of code, because code conflates a lot of issues that are irrelevant to the concurrency aspect. So you should first get an algorithm that does the synchronization, and then implement that algorithm.

Host: I was looking at the Paxos paper and some of your notes about it, and I saw that there's an eight-year gap between when you came up with the algorithm and when the paper, "The Part-Time Parliament," was actually published. Why the eight-year gap?

Leslie Lamport: Oh, well, the referees originally said the paper was okay but not terribly important. Fortunately, Butler Lampson realized the importance of the algorithm — together with the idea that you can implement anything, because it's implementing a state machine — and he went about proselytizing: building systems using Paxos and thinking in terms of state machines. So the idea was getting out, and I was in no hurry to publish; I just let the paper sit. Eventually a new editor came along. I think the status of the paper was that it had been accepted but needed revision, and he decided, yes, let's publish it. It was eventually published with a few additions to mention work that had been done in the interim, and I got Keith Marzullo to do that part for me.

And the framing of the Paxos paper was that all this had happened centuries ago and this manuscript had been discovered. I used that to the effect that when the details of something were, I considered, obvious and not interesting, the paper would say it's not clear what the Paxons did at this point. At any rate, Keith kept up the idea that this was a description of this ancient thing; he wrote a little preface to it and, I think, added some references.

Host: I saw in your writing that when you first presented the paper, you even dressed up as an Indiana Jones-style archaeologist. How did that go, presenting the Paxos paper and algorithm?

Leslie Lamport: Well, I think the lecture may have gone well, but nobody understood the algorithm — or rather, nobody understood the significance of the algorithm.

Host: It sounds like no one understood it except for Butler Lampson. What did he see that made him unique, I guess?

Leslie Lamport: Well, he had a good understanding of building systems — he really deserved his Turing Award. He was one of the original people at Xerox PARC who were building distributed personal computing. He and Chuck Thacker, I think, were probably the two senior people in that lab.

Paxos vs. Raft

Host: I saw later there was a paper describing a new algorithm that seems to solve the same problem: the Raft paper. I was wondering if you read it, and what your thoughts were on it versus Paxos.

Leslie Lamport: The authors actually sent me a draft of the original paper, and I looked at it and said — I forget whether it was "send it back to me when you have an algorithm" or "send it back to me when you have a proof," but you get the idea. They did add a proof to the paper. I never read later versions, but someone whose judgment I value had read it, and said that it's basically the Paxos paper, with some of the details the Paxos paper left unfinished now filled in — but described in a very different way.

The basic idea of how Paxos works is that it's two phases, and you're trying to implement a sequence of decisions. It involves a leader, and the leader has to get elected. But it turns out you can do the first phase once, and you don't have to do it again as long as you have the same leader. It's only the second phase that you keep doing; then, if the leader fails, you elect a new leader and do the first phase again.

So think about it in those two phases. But the way engineers like to think about it is: you keep doing this — the second phase — until the leader fails, and then you have to do this other thing. That's explaining it in the opposite order. In fact, when you start fresh you don't even have to do the first phase; what the first phase does can just be built into the initial state. But I think those two phases are the right way to understand it.
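The two phases he describes can be sketched as single-decree Paxos in Python. This is a minimal, in-memory illustration — no networking, no crash handling, and the names (`Acceptor`, `run_round`) are mine, not from any particular implementation:

```python
from dataclasses import dataclass
from typing import Any, Optional, Tuple

@dataclass
class Acceptor:
    """One acceptor's persistent state."""
    promised: int = -1            # highest ballot this acceptor promised
    accepted_ballot: int = -1     # ballot of the value it accepted, -1 if none
    accepted_value: Any = None

    def prepare(self, ballot: int) -> Tuple[bool, int, Any]:
        # Phase 1b: promise to ignore lower ballots; report any prior accept.
        if ballot > self.promised:
            self.promised = ballot
            return True, self.accepted_ballot, self.accepted_value
        return False, -1, None

    def accept(self, ballot: int, value: Any) -> bool:
        # Phase 2b: accept unless a higher ballot has since been promised.
        if ballot >= self.promised:
            self.promised = ballot
            self.accepted_ballot = ballot
            self.accepted_value = value
            return True
        return False

def run_round(acceptors, ballot: int, value: Any) -> Optional[Any]:
    """Leader's view: Phase 1 (prepare/promise), then Phase 2 (accept)."""
    quorum = len(acceptors) // 2 + 1
    granted = [(b, v) for ok, b, v in (a.prepare(ballot) for a in acceptors) if ok]
    if len(granted) < quorum:
        return None                      # no majority of promises
    # If any acceptor already accepted a value, the leader must adopt the one
    # with the highest ballot -- this is what preserves earlier decisions.
    top_ballot, top_value = max(granted, key=lambda bv: bv[0])
    if top_ballot >= 0:
        value = top_value
    acks = sum(a.accept(ballot, value) for a in acceptors)
    return value if acks >= quorum else None

acceptors = [Acceptor() for _ in range(3)]
print(run_round(acceptors, ballot=1, value="A"))  # "A" is chosen
print(run_round(acceptors, ballot=2, value="B"))  # still "A": the decision sticks
```

The second call shows the point Lamport makes: a later leader's first phase discovers the earlier decision and is forced to keep it, which is why, with a stable leader, only the second phase needs to repeat.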

But the Raft people also had this idea that Raft is better because it's simpler. I must say that a lot of people say Paxos is hard to understand, and I don't understand why — I've explained it to some people in five minutes and they understood it. At any rate, the Raft people said theirs was simpler, and they even taught Paxos to one class and Raft to another, and yes, the students said Raft was more understandable.

The interesting thing, though, is that there was a bug discovered in Raft and fixed, but I believe the algorithm that they found more understandable was the one with that bug. That made me realize: what does understanding mean? For me, understanding means you can write a proof of it. For most people, understanding means a warm fuzzy feeling, and the Raft description gave them more of a warm fuzzy feeling, because it seems to be the way programmers like to think about the algorithm — you do the second phase until you get a failure. But the way I describe it is the one that helps you get a better understanding of why it actually works.

LaTeX

Host: So we've talked about a lot of your papers. I know one of your other contributions, whether you knew it or not at the time, was LaTeX — something that has impacted the entire academic community. What's the story behind wanting to build LaTeX?

Leslie Lamport: Oh, that was very simple. I was in the process of starting to write a book, and it was clear that TeX was the basic typesetting system one had to use. But I felt I would need macros to make TeX do what I wanted it to do, and I figured that with a little extra effort I could make the macros usable by other people. The system I had been using before TeX was called Scribe, and the basic idea of Scribe was that you describe the logical structure of the document, and Scribe does the formatting. Well, Scribe didn't do that great a job of formatting. But obviously I liked the abstraction — the idea that it's the writing that matters, not the typesetting.
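As a small illustration of that Scribe-inspired idea, a LaTeX source names the logical structure and leaves the visual decisions to the document class (this toy document is mine, not from the interview):

```latex
\documentclass{article}
\begin{document}

\section{Introduction}  % a section, not a hand-chosen font size
The author states \emph{what} each piece of text is;
the style decides how it looks on the page.

\begin{itemize}
  \item a list item, not manually placed indentation
\end{itemize}

\end{document}
```

Swapping the document class or style restyles the whole document without touching the content — the separation Lamport describes.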

At some point I met Peter Gordon of Addison-Wesley — I'm not sure what you would call him, but he looks for books to publish — and he convinced me that I should write a book on it. In those days it never occurred to me that people would actually spend money for a book about software, but what the hell. And what he did was introduce me to a typographic designer at Addison-Wesley, who was responsible for the typographic design that's in the standard LaTeX styles. Basically, I did all that in my quote-unquote spare time; it took me six or nine months or so. I suppose the statute of limitations has run out, but I really spent some of that time working on it when I was allegedly billing the time to a project that had nothing to do with it.

Writing, Thinking, and Proofs

Host: On the topic of writing, you have a quote that I really enjoy: "If you're thinking without writing, you only think you're thinking." I was curious to hear what you mean by that.

Leslie Lamport: Well, it was really meant for people building computer systems. You have an idea and you think it's going to work, or you have something you think somebody else will want to use. Well, write a description of it. There's an old maxim I heard: write the instruction manual before you write the program. Great advice. I did not do that with LaTeX, but when I was writing the book and discovered that something was hard to describe, hard to explain, I knew it needed to be changed, and I made a number of changes as a result. But I didn't start at the beginning with the instruction manual.

Host: Why is writing conducive to good thinking?

Leslie Lamport: Because it's very easy to fool yourself. That underlies my whole idea of writing proofs. I learned this when I had to write correctness proofs of concurrent algorithms. As my algorithms started getting more complicated, so did the proofs. I have a PhD in math; I knew how to write proofs, and I started writing them the way I normally would. And I realized it just didn't work, because there were so many details involved that I couldn't keep track of them all.

So, as a computer scientist, I knew how to deal with complexity: hierarchical structure. I devised a hierarchical structure in which a proof is a sequence of steps, each of which has a proof; that proof is either a paragraph or again a sequence of steps, each with its own proof, and so on — you break the whole problem up into smaller pieces. So there's never any question of where something is coming from. You state that this step follows from these particular earlier steps, and if it does not follow from them, your proof is wrong. The theorem might still be correct, but your proof is wrong.
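The hierarchy looks roughly like this — a toy theorem written in the numbered-step style of Lamport's "How to Write a Proof," where `<2>1` means step 1 at level 2 (the example itself is mine):

```
THEOREM  For any sets S and T,  (S ∩ T) ⊆ S.
<1>1. Take any x in S ∩ T.
  <2>1. x is in S and x is in T.
        PROOF: By <1>1 and the definition of intersection.
  <2>2. Q.E.D.
        PROOF: x is in S, by <2>1.
<1>2. Q.E.D.
      PROOF: By <1>1 and the definition of subset.
```

Every leaf cites exactly the steps it depends on, so an error cannot hide in an unstated "obviously."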

Well, I discovered that worked great for writing my proofs of programs. But I also write proofs of theorems — proofs that are more like ordinary math — and I started trying the structure on them, and I discovered it worked beautifully. So I started trying to convince mathematicians to write these proofs. In one small seminar — I won't describe what it was about — I described this proof style to maybe twenty mathematicians. Their reaction shocked me. They became angry. I really thought they might physically attack me.

I believe that reaction is totally irrational, and when people act irrationally it tends to be out of fear. What I believe mathematicians are afraid of is that they're going to have to write their proofs to convince a computer program. And in fact, in one of those talks, I said very clearly: you don't have to be any more formal than you are now. You can write the exact same proof; it's just a matter of organizing things into a very simple hierarchical structure, and, when you're using a fact, mentioning that you're using it. Nothing about formalism at all. And after I gave that talk, someone got up and said, "I don't want to have to write my proofs for a computer program."

And in fact, it is more work doing it this way, because it reveals what you haven't said — the steps you may think are obvious but haven't written down. If you think you know something but don't write it down, you only think you know it. And that's where errors come in — where the errors in that one-third of papers come in — because it really makes you honest.

Career Reflections: Industry & Academia

Host: When I look across your career, I see a lot of contributions people might expect to come from academia — these papers and so on — but you did all of your work in industry. Why did you not see yourself as an academic, rather than someone working in industry?

Leslie Lamport: Well, I started out programming, and I eventually got jobs that took me into what we now call computer science. At the time, I never even realized there could be a science of computing. It wasn't until maybe the mid-to-late '70s that I realized, yes, there was a computer science, and I was a computer scientist. But it never seemed to me that computer science was an academic subject. At some point I had to make a choice between doing computer science — without calling it computer science — or teaching math at a university, and I chose, for fairly random reasons, to do computer science. Until maybe the mid-'80s, it just didn't seem to me that computer science was something people needed to go to a university to learn. And afterwards, I suppose I just didn't think it would be fun teaching computer science.

Host: I saw a footnote somewhere in your writing that said you felt like a failure at some point, because you wanted to develop this grand theory of concurrency and you never discovered it. Do you still feel that way? What are your thoughts on that footnote?

Leslie Lamport: A large percentage of the people doing the kind of things I was doing — which is not a large number of people — had this notion that they were looking for the Turing machine of concurrency. The Turing machine was this abstraction that really captured what computing was, and they were looking for something that would be the Turing machine of concurrent computing, and nobody succeeded. I mean, there are some people who think they succeeded. Petri nets are something I guess I don't have time to explain, but they were big in the '70s, and I was actually surprised to learn there's still a large community of people doing Petri nets. But what I now realize is that Petri nets, and most of the things people were doing, were really language-based.

And I was never interested in languages; I'm interested in what the language is expressing. In some sense, I've realized what the Turing machine of concurrency is: state machines. State machines, the way I now describe them, are a little different: they don't have commands, they just have a state and a next-state relation — even simpler than talking about commands and values and such. To me, that's the Turing machine of concurrency. But it doesn't serve the function that Turing machines serve, because what Turing machines do is describe what's possible, and state machines can describe anything, including things that are not possible.
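The state-plus-next-state-relation view he describes can be captured in a few lines of Python. This sketch is mine — `reachable` and the clock example are illustrations of the idea, not anything from TLA+ or its tooling:

```python
def reachable(initial, next_states):
    """Enumerate a state machine's states, given only a set of initial
    states and a next-state relation (state -> set of possible successors)."""
    seen = set(initial)
    frontier = list(initial)
    while frontier:
        s = frontier.pop()
        for t in next_states(s):
            if t not in seen:
                seen.add(t)
                frontier.append(t)
    return seen

# A tiny example: an hour clock whose only behavior is ticking forward.
hours = reachable({0}, lambda h: {(h + 1) % 12})
print(sorted(hours))  # all twelve hours, 0 through 11
```

Nothing here mentions commands or a programming language: the machine is just its states and the relation between a state and its successors, which is the simplification he's pointing at.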

And in fact there's a good reason for that. For example, when I describe an algorithm, I will say that the values of a variable can be any integer. Now, an implementation won't really have arbitrary integers, but talking about machine integers in the algorithm would complicate things unnecessarily. People have this funny idea that because something is infinite, it's more complicated. They've got it backwards: infinity was introduced to simplify things. The first thing you learn is arithmetic, and you learn arithmetic with an infinite number of integers, because if you restricted it to a finite set of integers, arithmetic would become much more complicated.

So the abstractions of mathematics — which people find difficult because they don't have the proper training in mathematics — are really what simplify things, and that's what you should use. The state machine is described, by me, using mathematics; that's the most powerful way of doing it. But computer people — computer scientists and programmers — are really hung up on languages, so they invent all sorts of languages, and in fact, if you want to give those languages a semantics, you do it in terms of a state machine. They think a language improves your thinking. It doesn't. There are reasons you use programming languages and don't write your code in math — they basically involve efficiency — but for understanding, you can't beat math, and trying to do it with something that looks like a programming language is just the wrong way to deal with concurrency.

Host: When I look at everything you've written, there are these little anecdotes — places where you say you never considered yourself smart, but you noticed other kids had an awful time understanding things; or there's a problem you solved where someone else had difficulties, but you don't view your contribution as a brilliant one. That doesn't connect for me, because you've also won a Turing Award and done all these amazing things. How can it be that you just "merely discover things" and aren't smart, yet you've achieved so much?

Leslie Lamport: Well, there's this general thing psychologists talk about: when someone is good at something, they don't realize how good they are at it, because it's simple to them. And there's the opposite: people who are bad at something think they're better than they are, because they're bad at it. Or, to put it a little more concisely: stupid people think they're smart because they're too stupid to realize they're not. The gift I have is not, in some sense, raw intelligence. It's abstraction, and it's only recently — the last ten or so years — that I realized how much better I am at that than most other people.

Host: At this point, you've experienced so much. Looking back on your career — if you could go back to yourself when you'd just graduated college and give yourself some advice, knowing what you know now, what would you say?

Leslie Lamport: One thing I learned fairly early in my life is that I shouldn't waste time trying to answer questions that I don't have to answer. I don't think about what I should have done, because that's a question I don't have to answer.

Outro

Host: Thank you for listening to the podcast. It's a passion project of mine that I've really enjoyed building. Another passion project I've been working on, kind of in secret, is building an ergonomic keyboard that I wish existed, and I finally have a prototype. I'd love to show you what we've built: it's ultra-low-profile and ergonomic, and I couldn't find anything like it on the market — that's why we built it. I'll put a link to the keyboard in the description; you can take a look and learn more about the project there. We could definitely use your support. Also, if you have any feedback about the show, I'd love to hear it. Comments on YouTube have led to guests coming on, like Ilya Grigorik and David Fowler — I wasn't aware of them until someone dropped a comment. Feedback in the comments also helped me learn to reduce the number of cliffhangers in the intros. So your comments definitely make a difference. Please keep letting me know what you'd like to see more of in the show, and I'll see you in the next episode.