**A BOOK THAT CHANGED MY LIFE**

My career was made possible by a combination of privilege, talent, luck, and effort, with early success leading to access to opportunities for later success. Someday I may try to tease apart all those strands, but today I’ll just mention a few of them, starting with a book I won as a contest prize in high school: Hardy and Wright’s “An Introduction to the Theory of Numbers”. My favorite chapter was chapter 19, entitled “Partitions”.

A partition is a way of writing a positive integer as a sum of one or more positive integers, where we list the parts from largest to smallest and we allow repeats. Two examples of partitions of the number 19 are 6 + 5 + 5 + 3 and 4 + 4 + 4 + 3 + 3 + 1. The first partition has exactly 4 parts; the second has largest part 4. There’s a beautiful pictorial way to show that for every *n* and *k*, the number of partitions of *n* with *k* parts equals the number of partitions of *n* with largest part *k*. The proof gives a one-to-one correspondence, or *bijection*, between the set of partitions of *n* with *k* parts and the set of partitions of *n* with largest part *k*; see Endnote #1. Although Hardy and Wright’s book is about number theory, this argument is really an example of what nowadays would be called a combinatorial proof, or more specifically, a bijective proof. Early exposure to this kind of argument gave me a love of bijective proofs and pictorial combinatorics.

**A TALK THAT CHANGED MY LIFE**

Fast forward to my years of graduate study at U. C. Berkeley. I had won a National Science Foundation Graduate Fellowship on the strength of what I’d done in high school and college, and the Fellowship gave me access to travel funds which I dipped into to pay for my participation in the West Coast Number Theory Conference, held each year at the beautiful Asilomar Conference Center. If I hadn’t gotten the Fellowship, I wouldn’t have gone to the conference, and then I wouldn’t have heard Jeff Lagarias talk about his work on tilings with John Conway.

The problem Conway had tackled was, given a triangular array of dots like the one shown below, can you ever cover the dots with line segments that don’t touch each other, where each segment covers three adjacent dots in one of the three directions?

The picture shows six segments that cover all but three of the points. Is there a way to arrange seven segments that cover all the points? Is there a way to cover all the points if I choose a bigger triangular array of dots? Conway had shown that the answer was “No”, and Lagarias had worked with him to develop the idea further.

As a result of my getting to know Lagarias and corresponding with him by email, I had a chance to read an early version of an article by Bill Thurston presenting his geometrical take on the work of Conway and Lagarias, and this in turn enabled me to wonder, “Hmm, how can I do pictorial combinatorics using Thurston’s approach?” As I’ll explain in an upcoming publication in the *Mathematical Intelligencer*, pursuing these wonderings led me to look at regions like this:

Nowadays this is called an *Aztec diamond* of order 3. An Aztec diamond of order *n* has rows of length 2, 4, 6, …, 2*n*–2, 2*n*, 2*n*, 2*n*–2, …, 6, 4, 2. Here’s a way of covering the Aztec diamond of order 3 by 1-by-2 and 2-by-1 rectangles:

We call these rectangles *dominos* and we call the configuration of dominos a *tiling* of the Aztec diamond of order 3.

Drawing lots of pictures, I found that the number of domino tilings of the Aztec diamond of order *n* (for *n* = 1, 2, 3, and 4) followed the pattern 2, 8, 64, 1024. These numbers are all powers of 2; specifically they are 2^{1}, 2^{3}, 2^{6}, and 2^{10}. And those exponents 1, 3, 6, 10 aren’t just any old numbers; they are the triangle numbers *T*_{n}, known to humankind since antiquity and given by the formula *T*_{n} = *n*(*n*+1)/2.
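Nowadays you don’t need to draw lots of pictures to check these counts; a computer can do the drudgery. Here’s a brute-force counter I might write today (a sketch of my own devising, with made-up function names, not the method anyone used back then): it places a domino on the first uncovered cell, either to the right or upward, and recurses.

```python
from functools import lru_cache

def aztec_cells(n):
    """Unit squares (x, y) of the Aztec diamond of order n:
    those whose centers satisfy |2x+1| + |2y+1| <= 2n."""
    return frozenset((x, y) for x in range(-n, n) for y in range(-n, n)
                     if abs(2*x + 1) + abs(2*y + 1) <= 2*n)

def count_tilings(region):
    """Count domino tilings of a set of cells by covering the first
    empty cell (in column order) with a domino to its right or above."""
    @lru_cache(maxsize=None)
    def go(empty):
        if not empty:
            return 1
        x, y = min(empty)   # first empty cell; its left and lower
        total = 0           # neighbors are already covered or outside
        for other in ((x + 1, y), (x, y + 1)):
            if other in empty:
                total += go(empty - {(x, y), other})
        return total
    return go(frozenset(region))

for n in range(1, 5):
    print(n, count_tilings(aztec_cells(n)))   # 2, 8, 64, 1024 = 2^{T_n}
```

The memoization (`lru_cache`) is what keeps the recursion from drowning in repeated subproblems; without it, even order 4 would take a noticeable while.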

I didn’t know it at the time, but I wasn’t the first to make this conjecture. Physicists Grensing, Carlsen, and Zapp had proposed this formula back in 1980. But they didn’t prove the formula, and more to the point they didn’t think it was worth proving because for their purposes the numbers given by the formula were too small. To explain what I mean by “too small”, let’s switch over to looking at domino tilings of the 2*n*-by-2*n* square. In the 1960s physicists Fisher, Kasteleyn, and Temperley had found an exact formula for the number of domino tilings of a 2*n*-by-2*n* square and they’d shown that when *n* is large the number of tilings is close to 1.34 to the power of the area of the square. But for a large Aztec diamond, Grensing, Carlsen, and Zapp’s conjecture implies that the number of tilings is exactly 2^{1/4} (or about 1.19) to the power of the area of the Aztec diamond. The fact that 1.19 is less than 1.34 means that domino tilings of Aztec diamonds are more tightly constrained than domino tilings of squares, and this made the former less relevant to the kinds of questions the physicists were interested in. So although they stated the conjecture, they didn’t spend any effort figuring out how to prove it.

**PEOPLE WHO CHANGED MY LIFE**

Once I had a conjecture that I couldn’t figure out how to prove on my own, I needed to bring in other people to help me. And here I had another advantage: doing well on the U.S.A. Mathematical Olympiad as a high schooler had given me a chance to spend two summers attending the training program for the U. S. Math Team and to get to know a then-young mathematician named Michael Larsen. When I couldn’t solve the counting problem on my own, I mentioned it to Michael, who mentioned it to an even younger Noam Elkies. Noam found the first proof, and shortly thereafter Michael found the second.

Knowing that Michael and Noam had proved that formula correct settled the question but didn’t end my quest, because different mathematicians are satisfied with different sorts of answers to the question “Yes but why?”, and neither of their proofs was the kind of explanation that satisfied me. I wanted a bijective proof like the proof I mentioned earlier about partitions of numbers, and the fact that my conjecture involved powers of 2 made me convinced that such a proof could be found.

The question “In mathematics, what are there 2^{N} of?” has many correct answers. One answer is “strings of *N* symbols in which you have 2 choices for each of the *N* symbols and each symbol can be chosen independently of every other.” You could for instance look at strings of H’s and T’s of length *N*, corresponding to the 2^{N} different outcomes for an experiment in which you toss a coin *N* times and keep track of each toss individually. The formula 2^{n(n+1)/2} suggested that there ought to be a way to encode each and every tiling as a string of 0’s and 1’s of length *n*(*n*+1)/2.

Here I had another bit of advantage: I held a National Science Foundation Postdoctoral Fellowship that enabled me to be at Berkeley, which enabled me to get to know Greg Kuperberg and to interest him in the problem. We worked together and came up with domino shuffling, which is the kind of bijection I was looking for, or something close enough.

What is domino shuffling? I won’t tell you, but I will show you. Here are the names of some of Mathologer’s enthusiastic fans, along with links to the implementations of domino shuffling that they created in the 24 hours after Mathologer posted his video about the Aztec diamond:

Dmytro Fedoriaka: http://fedimser.github.io/adt/adt.html

Philip Smolen: https://tradeideasphilip.github.io/aztec-tiles/

Bjarne Fich: http://rednebula.com/html/arcticcircle.html

chrideedee: https://chridd.nfshost.com/tilings/diamond

The Coding Fox: https://www.thecodingfox.com/interactive/arctic-circle/

WaltherSolis: https://wrsp8.github.io/ArcticCircle/index.html

Jacob Parish: https://jacobparish.github.io/arctic-circle/

Now *that* is some fan-base!

In writing up the work I’d done with Elkies, Kuperberg and Larsen, I dubbed the shapes we’d studied “Aztec diamonds” because the design can be found in much pre-Columbian art. I tried to figure out if there was a specific group of Native Americans most closely associated with the motif but it seemed to be shared between many nations. I eventually concluded that the Hopi made the most use of it, but “Hopi diamond” didn’t sound as good as “Aztec diamond”, so I chose euphony over ethnographic accuracy.

**BEYOND COUNTING**

The gap between the bases 1.19 and 1.34 told me that domino tilings of Aztec diamonds must exhibit less variety than domino tilings of squares, and suggested that a random domino tiling of a big Aztec diamond would look different from a random domino tiling of a big square. If I wanted to explore this phenomenon experimentally, domino shuffling was precisely the sort of tool I needed: all I had to do was use *n*(*n*+1)/2 coin flips (or some computer-generated surrogate) to get a random bit-string of length *n*(*n*+1)/2 for some large-ish *n* and then use the shuffling algorithm to convert the string of bits into a domino tiling of the Aztec diamond of order *n*.
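For readers who want to play along at home, here is roughly what one pass of the shuffle looks like in code. This is only a sketch of the standard destruction/sliding/creation formulation, with a coordinate convention, parity rule, and function names of my own choosing, and it flips its coins on the fly rather than reading a pre-generated bit-string:

```python
import random

def aztec_cells(n):
    """Unit squares (x, y) of the Aztec diamond of order n."""
    return {(x, y) for x in range(-n, n) for y in range(-n, n)
            if abs(2*x + 1) + abs(2*y + 1) <= 2*n}

def shuffle_step(dominoes, k, rng):
    """One shuffling pass: a tiling of the order-k diamond becomes a
    random tiling of the order-(k+1) diamond.  A domino is a pair of
    cells with its left/bottom cell listed first."""
    live = set(dominoes)
    # 1. Destruction: delete 2x2 "bad blocks" -- two parallel dominoes
    #    whose slides would make them collide head-on.
    for d in list(live):
        (x, y) = d[0]
        if (x + y + k) % 2 == 0:
            for pair in ((((x, y), (x+1, y)), ((x, y+1), (x+1, y+1))),
                         (((x, y), (x, y+1)), ((x+1, y), (x+1, y+1)))):
                if pair[0] in live and pair[1] in live:
                    live -= set(pair)
    # 2. Sliding: horizontal dominoes move one step up or down, vertical
    #    ones one step left or right, by checkerboard parity.
    slid = set()
    for ((x, y), (x2, y2)) in live:
        s = 1 if (x + y + k) % 2 == 0 else -1
        if y2 == y:                      # horizontal: slide vertically
            slid.add(((x, y + s), (x2, y2 + s)))
        else:                            # vertical: slide horizontally
            slid.add(((x + s, y), (x2 + s, y2)))
    # 3. Creation: the uncovered cells split into 2x2 blocks; fill each
    #    with two horizontal or two vertical dominoes by a coin flip.
    empty = aztec_cells(k + 1) - {c for d in slid for c in d}
    for (x, y) in sorted(empty):
        block = {(x, y), (x+1, y), (x, y+1), (x+1, y+1)}
        if (x + y + k) % 2 == 0 and block <= empty:
            empty -= block
            if rng.random() < 0.5:
                slid.add(((x, y), (x+1, y)))
                slid.add(((x, y+1), (x+1, y+1)))
            else:
                slid.add(((x, y), (x, y+1)))
                slid.add(((x+1, y), (x+1, y+1)))
    return slid

def random_tiling(n, rng=random):
    """Grow a random tiling of the order-n Aztec diamond from scratch."""
    tiling = set()
    for k in range(n):
        tiling = shuffle_step(tiling, k, rng)
    return tiling
```

Each coin flip decides the orientation of one newly created 2-by-2 block, which is exactly the bit-string encoding at work: run the loop from order 0 up to order *n* and the flips turn into a tiling.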

I didn’t get around to using domino shuffling to study random tilings until I finished my postdoctoral work at Berkeley and joined the MIT math department as an assistant professor. Coming to MIT was a bit of a gamble because at the time MIT was notorious for not giving tenure to assistant professors whose main interest was combinatorics. At times I feel wistful about academic roads not taken; where would I be now – *who* would I be now – if I’d succumbed to the attractions of liberal arts schools and sought a job at a place like Swarthmore or Williams? But having enjoyed my postdoctoral years at Berkeley and the boon provided by the chance to collaborate with Greg Kuperberg, I thought that being in the Boston area with potential collaborators at MIT and Harvard would prove to be very beneficial for my research. And I was right, though I wouldn’t have guessed just how young many of my future collaborators would be. MIT had a well-established and well-funded program called UROP (Undergraduate Research Opportunities Program) that few of the professors in the math department were taking advantage of, so I mostly had all the bright math undergrads to myself. One such undergrad was Sameera Iyengar; I hired her to implement domino shuffling, and she generated a picture that looked something like this:

Grensing, Carlsen, and Zapp had known that domino tilings of Aztec diamonds are more constrained than domino tilings of squares, but hadn’t guessed that these extra constraints make themselves felt in different ways in different parts of the Aztec diamond. Sameera’s pictures made it clear that most of the randomness shows up in the middle. I realized that what I’d been looking at was much more than a fun new kind of counting problem; it was potentially a testbed for studying the ways in which geometrically constrained systems like tilings could exhibit propagation of constraints from the boundary of a system to the deep interior.

To get the beginnings of an understanding of propagation of constraints in tilings, consider the middle cell of the northwest border of an Aztec diamond of order 5, marked by a question mark in the left panel of this picture.

That cell must either be covered by a horizontal domino that it shares with the cell to its right or a vertical domino that it shares with the cell below it. But if the “?”-cell is covered by a horizontal domino, marked “1” in the right panel of the picture, then that forces the placement of a second domino, marked “2”, which in turn forces the placement of a third domino, marked “3”. Likewise, if the “?”-cell is covered by a vertical domino, then that also forces the placement of two more dominos. (You can view this cascade of causation as a metaphor for the way access to opportunities, or lack of access to opportunities, can cascade in one’s professional career. I’ve already described some ways in which I was the beneficiary of a positive cascade effect. Here’s an example of a negative cascade that I fortunately haven’t fallen prey to in my own career: If you don’t publish enough, some universities will give you more courses to teach, which cuts down on how much time you get to spend on research, which makes it hard to write articles for publication.)
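You can also watch the cascade happen by machine. Here’s a little sketch (the coordinates and names are my own choices for this illustration; in my convention the “?”-cell of the order-5 diamond is (–3, 2)): it places the first domino and then repeatedly covers any cell that has only one uncovered neighbor left.

```python
def aztec_cells(n):
    """Unit squares (x, y) of the Aztec diamond of order n."""
    return {(x, y) for x in range(-n, n) for y in range(-n, n)
            if abs(2*x + 1) + abs(2*y + 1) <= 2*n}

def forced_cascade(region, first_domino):
    """Place one domino, then repeatedly cover any cell that has only a
    single uncovered neighbor left.  Returns the dominoes forced by the
    initial choice, in the order they were forced."""
    covered = set(first_domino)
    forced = []
    while True:
        move = None
        for (x, y) in sorted(region - covered):
            options = [c for c in ((x+1, y), (x-1, y), (x, y+1), (x, y-1))
                       if c in region and c not in covered]
            if len(options) == 1:       # only one way to cover this cell
                move = ((x, y), options[0])
                break
        if move is None:
            return forced
        covered |= set(move)
        forced.append(move)

ad5 = aztec_cells(5)
# Horizontal choice: domino "1" covers the "?"-cell and the cell to its
# right; two more dominoes ("2" and "3") are forced.
print(forced_cascade(ad5, ((-3, 2), (-2, 2))))
# Vertical choice: domino shared with the cell below; again two more
# dominoes are forced before the cascade peters out.
print(forced_cascade(ad5, ((-3, 1), (-3, 2))))
```

In both cases the cascade runs along the border for exactly two more dominoes and then stops, just as the picture shows.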

In the 1990s I was able to recruit many Boston-area undergrads and graduate students to the study of tilings. Grad student William Jockusch and my former Math Olympiad Program roommate Peter Shor helped me find the first proof of what I called the arctic circle theorem, showing that if you choose a random domino tiling of an Aztec diamond of order *n*, then with high probability there’ll be four completely nonrandom-looking “frozen regions” in the corners where the tiling looks like brickwork, and a central “temperate” region where the tiling looks fairly random, and the boundary between the frozen regions and the temperate zone converges in shape to a perfect circle as *n* goes to infinity. (See Endnote #2.)

In math, what makes a question good isn’t always obvious from the question, or even from the answer; it’s about where the question leads, and what kind of story it gives rise to. In later years, many people moved into the study of domino tilings of Aztec diamonds, using tools from a wide variety of mathematical disciplines such as differential equations, random matrix theory, algebraic geometry, integrable systems, and even physics. Just last week, Rick Kenyon told me about a new result about Aztec diamonds he’s obtained in work with Cosmin Pohoata. So the story isn’t over.

**LESSONS**

Are there lessons in this story for others who seek careers as research mathematicians? I might have thought so before the pandemic hit; now it’s unclear to me what academia is going to look like in the years ahead. Given a choice between a tenure-track job at a good place and a non-tenure-track job at a great place (such as a postdoctoral position or a nominally tenure-track job that in practice might more accurately be described as “tenure-plank”), which should you pick? It’s a tough problem, but I will say that surrounding yourself with smart people is a good recipe for doing your best work.

Speaking of the pandemic: one positive effect of the disappearance of in-person seminars is the burgeoning of online seminars. Attending research talks by top mathematicians is now within the reach of anyone with an internet connection and the time to make use of it. There’s a potential for democratization of doctoral education and math research that I hope will continue even after the pandemic is over. You’ll remember from early in this essay that I benefitted from going to a conference at Asilomar where I got to hear Lagarias talk about his work with Conway. Maybe other young people would have gotten the same inspiration from Jeff’s talk as I did but didn’t get to go to Asilomar. Hopefully in the new world of math research that’s being assembled from the pieces of the pre-pandemic world, more people will get the kind of chance that I got.

Even more important than the talks that I went to were the people I had the chance to collaborate with. There’s a ratchet mechanism at work here: collaborating with good people enables you to do good work, and doing good work gives you access to good collaborators. We need to give more people the opportunity to take advantage of the ratchet. When people want to collaborate with me, I try hard to say “Yes”.

Another lesson is the importance of recognizing opportunities when they come your way. Aztec diamonds weren’t first discovered by me; they were first discovered by physicists who didn’t recognize their lucky find. By being one of the first to jump into the study of Aztec diamonds and related structures, I was able to spend an exciting decade pushing back the frontiers of one corner of mathematics. The mathematician Gian-Carlo Rota (quoting someone else, but I forget whom!) wrote “When pygmies cast such long shadows, it must be very late in the day,” but I like to turn that around and say that short people can cast long shadows if they show up early enough.

I gave a talk at the 11th Gathering 4 Gardner conference called “Conway’s Impact on the Theory of Random Tilings” that covers some of the same ground as this essay. If you look carefully you’ll see that I’m wearing one of my “Random Tilings Research Group” shirts.

One question haunts me from time to time: Where did that three-dots-in-a-line problem Conway worked on come from – the one that shaped my career in such a deep way? I have a hunch it originated with mathematician David Klarner, but I’m not sure. If any of you have information about this, please let me know!

**ENDNOTES**

#1: Here is a partition of the number 19 depicted as what is called a Ferrers diagram:

Reading by rows we get the partition 6+5+5+3, a partition with 4 parts; reading by columns we get the partition 4+4+4+3+3+1, a partition whose largest part is 4. This gives us a bijection between partitions of 19 with 4 parts and partitions of 19 with largest part 4: given some partition with 4 parts, use the part-sizes as the row-sizes of a Ferrers diagram, and then use the column-sizes of that same diagram as the parts in a partition whose largest part is 4. Or if you prefer you can flip the diagram across its diagonal. The same construction gives a bijection between partitions of *n* with *k* parts and partitions of *n* with largest part *k*, for all positive integers *n* and *k*.
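If you like, you can check the bijection by machine. Here’s a small sketch (the function names are invented for this illustration): `conjugate` flips a Ferrers diagram across its diagonal, and a brute-force enumeration confirms that conjugation carries the partitions of 19 with 4 parts exactly onto the partitions of 19 with largest part 4.

```python
def conjugate(p):
    """Flip the Ferrers diagram of a partition across its diagonal:
    the (i+1)-st column size is the number of parts bigger than i."""
    return [sum(1 for part in p if part > i) for i in range(p[0])]

def partitions(n, max_part=None):
    """Generate all partitions of n, parts listed largest to smallest."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield []
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, first):
            yield [first] + rest

print(conjugate([6, 5, 5, 3]))          # [4, 4, 4, 3, 3, 1]

# Conjugation swaps "has 4 parts" with "has largest part 4":
with_4_parts   = [p for p in partitions(19) if len(p) == 4]
largest_part_4 = [p for p in partitions(19) if p[0] == 4]
assert sorted(conjugate(p) for p in with_4_parts) == sorted(largest_part_4)
```

Conjugating twice gives back the original partition, which is another way of seeing that the correspondence is a genuine bijection.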

#2: The term “arctic circle” is perhaps a little misleading for Earthlings, since on Earth the frozen arctic region is *inside* the arctic circle while in an Aztec diamond the four frozen regions are *outside* the arctic circle, but the name has stuck.

It still amazes me that the boundary of the temperate region tends to be circular in the limit as *n* approaches infinity. Why is it a circle, and not some other curve? I don’t have an intuitive explanation for this.

**REFERENCES**

J. H. Conway and J. C. Lagarias, Tiling with polyominoes and combinatorial group theory. Journal of Combinatorial Theory Series A, Volume 53 (1990), pp. 183–208.

G. H. Hardy and E. M. Wright, An Introduction to the Theory of Numbers. Oxford University Press, 1938.

W. Thurston, Conway’s Tiling Groups. The American Mathematical Monthly, Volume 97 (1990), pp. 757–773.

Before you can visit that other world — my second home — you have to imagine it, which is trickier than it sounds: sometimes you *think* you’re imagining it but in fact you aren’t. It helps to describe what you see in your mind’s eye to someone who’s been there, who can help you determine whether you’re genuinely imagining the other world or deluding yourself. (For instance, it may look like you’ve squared the circle, but someone with a good grasp of geometry can help you see that you haven’t.) Ultimately, when your imaginings of the other world are properly calibrated, you can go there — though “go there” is a misleading phrase, since all of us who visit the other world are engaged in nothing more than calibrated imaginings. But surely the place is real, for how else can you explain why two visitors, independently exploring precincts of that world that no one has ever visited, will see the same things?

Of course, I am talking about the world of pure mathematical form. One recent visit I took to that world prompted me to write the following in a succession of tweets:

My research life this week, in allegory: I’m exploring a magical landscape with some friends, and we’re teaching each other about the terrain. I say “I think there’s a castle on the other side of that hill,” and we climb it, and sure enough, there’s a castle. They get to work turning on the lights and fixing plumbing with impressive speed. Meanwhile, they wonder how I knew there’d be a castle there. I’d like to be able to say “I’ve been climbing hills for forty years, so I’ve learned to recognize the sorts of hills that lead to castles.” But I haven’t. Most of the hills I climb still don’t have castles behind them. But what I’ve learned is that you don’t find castles if you don’t climb hills.

Some features of the mathematical world are easy to imagine as idealizations of experiences in our physical world: for instance, a dot that shows no sign of internal structure leads you to the idea of a perfect Euclidean point, even if there are no such things in our world. Other features of the mathematical world can’t be seen on a second, tenth, or even hundredth visit, not because they’re hidden in some remote part of the math world that’s hard to go to, but because they can’t be apprehended without whole new sensoria — sensoria that are peculiar to the nature of the math world, and that individual visitors must painstakingly construct for themselves over a long period of time.

I tweeted about this aspect of math recently, writing in the terse, Figure-out-the-context-yourself style common on Twitter:

Like Narnia, but it takes years of study and practice to learn how to get there, and when you’re there, your earthly body stays behind with a vacant expression on its face and your spouse says “You’re doing Narnia now, aren’t you?”

My tweet was evidently too terse because some readers mistakenly thought I was talking about meditation, not mathematics.

**PIRANESI**

What prompted the tweet was my reading a paragraph from a fantasy novel describing a character’s passage from one world into another, and my sudden conviction that jumping from our world into a very different one *couldn’t* be as simple as doing a ritual — and my realization a few seconds later that the strength of my conviction arose from my own experience of becoming a mathematician and my over-strong identification with the book’s hero.

Susanna Clarke’s “Piranesi” is about a man who —

Wait, hold on a minute. This essay is full of spoilers, so if at any point my description makes you suspect you’d enjoy “Piranesi”, **stop reading immediately**, get a copy of Clarke’s book, savor it, and then (and only then) come back to “Children of the Labyrinth”. Essays celebrating the joys of math are a dime a dozen (heck, you get mine once a month for free!), but a book by Susanna Clarke is a once-in-a-decade treat.

Where was I? Oh yes: “Piranesi” is about a man who —

Sorry, my wife says I’m not being clear enough.

Okay, here we go, for real. Piranesi lives in another world, having arrived there from our world but having subsequently lost all memory of where he comes from. (Actually, he knows some things about our world but doesn’t *know* that he knows them, having no place for this knowledge in his conception of the world.) He dwells alone in an enormous building that he calls the House (or the World). The House is something like a vast abandoned art museum, susceptible to periodic flooding by predictable but mysterious Tides. The story is told through entries in a journal that Piranesi faithfully keeps. To him, the House, despite its dangers, is a place of beauty and repose that provides for all his wants, and though he never compares the World to a womb, I couldn’t help feeling that Piranesi has reverted to the life of the not-yet-born.

The name “Piranesi” isn’t the name his parents gave him in our world, nor is it a name he chose for himself; it has been bestowed upon him by the only other person he sees in the House, whom he calls the Other. The Other knows things about the House that Piranesi doesn’t, but Piranesi isn’t eager to learn what the Other knows; Piranesi is content knowing that he is the Beloved Child of The House, free to wander its Halls, marvel at its Statues, draw fish from its Tides, and lovingly tend the corpses he has found (yes, the book is a bit of a murder mystery as well as a fantasy novel). For his part, the Other has very little interest in any of the above; he only cares about the magical powers he hopes to obtain with Piranesi’s help.

As the tale progresses, other people come into the House, and Piranesi begins to question his own understanding of who and where he is. Before that happens, Clarke has already revealed that Piranesi is living in a “tributary world”, created by a sort of spiritual runoff from our own (which explains why the statues he loves all depict things from our world). One feature of the tributary world is that spending too much time there brings amnesia — hence Piranesi’s inability to recall where he comes from.

At one point, Piranesi, having come by degrees to understand that his current world is not the only world that exists, tells a visitor (whom I’ll call X) what he understands of the relationship between the older world that gave rise to the House and the House itself: “In this World the Statues depict things that exist in the Older World.” The following exchange ensues:

“Yes,” said [X]. “Here you can only see a representation of a river or a mountain, but in our world — the other world — you can see the actual river and the actual mountain.”

This annoyed me. “I do not see why you say I can *only* see a representation in this World,” I said with some sharpness. “The word ‘only’ suggests a relationship of inferiority. You make it sound as if the Statue was somehow inferior to the thing itself. I do not see that that is the case at all. I would argue that the Statue is superior to the thing itself, the Statue being perfect, eternal and not subject to decay.”

You can see why this makes me think of math.

**BEFORE, AFTER**

I don’t find that doing math makes me forget things about the real world, though I do at times wonder whether my mathematical propensities have recruited portions of my brain that evolved for other purposes, and that, as a result of their repurposing, I am under-equipped for this world. For instance, would I be better at remembering people’s names if I didn’t spend so much time learning the particularities of all the mathematical forms that fascinate me?

But there *is* a kind of amnesia associated with learning mathematics, familiar to many teachers, especially those more devoted to research than to teaching, namely: *the forgetting of what it is like not to know the things one knows*. This is especially true when, as is so often the case with math, the things one knows are not facts but perspectives and habits of thought. What seems like a straight path to the adept can seem like a tortuous labyrinth to the novice.

The network of Halls that Piranesi inhabits seems like a labyrinth to other visitors to his world, and one of them asks him: “How long did it take you to learn it? The way through the labyrinth?” Here is his reaction:

I opened my mouth to say loudly and boastfully that I have always known it, that it is part of me, that the House and I could not be separated. But I realised, even before I spoke the words, that it was not true. I remembered that I used to mark the Doorways with chalk in exactly the same way that [X] did and I remembered that I used to be afraid of getting lost. I shook my head. “I don’t know,” I said. “I can’t remember.”

All too often, that’s me as a teacher.

There’s a line from the movie “The Paper Chase”, in which the fearsome Professor Kingsfield tells a room of first-year law-school students “You come in here with a skull full of mush … and you leave thinking like a lawyer.” This raises the question, will the newly-credentialed future selves of these first-years still be able to *not* think like lawyers? Or does the education process take away with one hand even as it gives with the other? Consider how hard it is for you (unless perhaps you recently had a stroke) to see the letters that make up this sentence the way an illiterate person might, as confusing geometric patterns; if you’ve learned how to read fluently, you cannot *see* these marks without *reading* them. Maybe one of the secrets of being a good teacher is an ability to swim against the tide of forgetting what it is like to not know, to interrupt the automatic insertion of acquired interpretations, to remember the texture of one’s former mental mush, so that in the classroom one can help other people jell their own mush into the needed cortical structures.

My wife, a psychologist, points out that knowledge overwrites anticipatory imagination: “You can remember what the campus looked like when you arrived at college, but can you remember what you imagined the campus would look like before you saw it?” Of course the answer is no. For that matter, I no longer know how I imagined Susanna Clarke looked before I saw her photograph, or how I pictured Jonathan Strange and Mr. Norrell, the protagonists of Clarke’s debut novel, when my wife and I were listening to the audiobook a decade ago; those mushy imaginings have been replaced by memories of the miniseries that we watched a decade later. The After displaced the Before. I wonder to what extent, years from now, assuming “Piranesi” gets made into a movie or TV series, I’ll be able to remember what the House looks like to me now, in my mind’s eye, in 2020.

The issue of Before versus After comes up not just in pedagogy but also in research. Most of us researchers have “Aha!” moments, when our way of looking at a problem is suddenly transformed, and afterwards we are sometimes tempted to regret all the time we “wasted” stumbling around through the Fog of the Before — forgetting that this stumbling may have been a necessary stage on the way to the Illumination of the After. Mathematician Hermann Weyl wrote about this dichotomy in a passage I encountered in the book “Out of the Labyrinth: Setting Mathematics Free” by Robert Kaplan and Ellen Kaplan.

To begin with, there are definite concrete problems, with all their undivided complexity, and these must be conquered by individuals relying on brute force. Only then can the axiomatizers come and conclude that instead of straining to break in the door and bloodying one’s hands one should have first constructed a magic key of such and such a shape and then the door would have opened quietly, as if by itself. But they can construct the key only because the successful breakthrough enables them to study the lock front and back, from the outside and from the inside.

**OUTSIDE, INSIDE**

*MAJOR SPOILER ALERT: …*

The book closes on a hopeful note, or at least I found it hopeful, because I don’t want to have to choose between my two worlds, and Piranesi learns that he doesn’t have to choose between his. Through a wondrous inversion, Piranesi discovers that the World he has inhabited for so long now inhabits him. He writes:

In my mind are all the tides, the seasons, their ebbs and their flows. In my mind are all the halls, the endless procession of them, the intricate pathways. When this world becomes too much for me, when I grow tired of the noise and the dirt and the people, I close my eyes and I name a particular vestibule to myself; then I name a hall. I imagine I am walking the path from the vestibule to the hall. I note with precision the doors I must pass through, the rights and lefts that I must take, the statues on the walls that I must pass.

I carry a lot of my second world around in my head, and when I make my voyages of discovery to that other world, sometimes a voyage requires no paraphernalia at all, not even pencil and paper. Admittedly, this is the exception rather than the rule; my short-term memory is no more capacious than the average person’s, and usually I need pencil and paper to keep track of where I am and where I’m going. Sometimes I even need to enlist the help of a magical servitor that, while lacking imagination, can obediently carry out clearly-specified tasks too arduous for my limited brain. But in the end, my laptop’s assurances about what it sees don’t satisfy my desire for insight; like the “axiomatizers” in Weyl’s passage, I want to use what my laptop tells me so that I can construct a magic key that allows me to truly *understand* what a brute force computation has merely *verified*. What I find most satisfying, at each journey’s end, is to understand some part of the mathematical landscape so well that I can fit it inside my mind in its entirety, and I can imagine strolling along it, explaining every beautiful part of it to myself or to an imagined Other.

Early one morning, about ten or twenty years ago, I figured something out in my head while my wife was sleeping (in my first world, she was in bed with me and I didn’t want to risk waking her, so I had to forego pencil and paper and figure out a sequence of mental handholds that would get me to where I wanted to go in my second world). I was proud of my discovery but was chagrined when, a few years later, I found I was unable to reconstruct my thought process. From time to time I’ve returned to the problem and failed to reconstruct what it was that seemed so clear to me back then. Don’t get me wrong: the claim is true, and I can prove it. I just can’t *see* it the way I once did.

Maybe it’s time for me to set aside pride and ask one of my math-friends, one of my fellow Children of the Labyrinth, to help me find again the path that seemed so straight to me before.

The Beauty of the House is immeasurable; its Kindness infinite.

*Thanks to Jeremy Cote, Sandi Gubin, Joe Malkevitch, and Evan Romer.*

Next month: My Life with Aztec Diamonds.

Adults are like that too. Being told what we can’t do takes us back to the time when we were powerless children, and sometimes we grownups respond to prohibitions in childish ways. Consider how many supposedly grown-up people have tantrums when they’re told they can’t enter a certain establishment unless they’re wearing a face mask! I sometimes wonder whether I’ve really matured as much as my change in station over the past half-century (from snotty pre-teen to tenured professor) would indicate; maybe I only seem more mature because, in my present life circumstances, fewer people tell me what I can’t do.

**SQUARING THE CIRCLE**

Among the adults who don’t like being told “You can’t do that” are many adults who enjoy math as a hobby, and the most common thing they’re told they can’t do is square the circle. Squaring the circle is the problem of constructing a square with the same area as a given circle, using only straightedge and compass in the classic Greek manner. (A straightedge is a ruler with all the markings removed, but by way of compensation, it can be as long as you need it to be. A compass is a tool for poking other kids in geometry class when the teacher’s back is turned.) Telling people “But it’s been proved that you can’t square the circle!” often proves to be an irresistible lure, and mathematicians regularly receive correspondence from strangers claiming to have found a solution.

David Richeson’s new book “Tales of Impossibility: The 2000-Year Quest to Solve the Mathematical Problems of Antiquity” (Princeton University Press, 2019) is devoted to the history of squaring the circle and three related problems: trisecting the angle, doubling the cube, and constructing (most) regular polygons. This well-written, amply-illustrated book won’t fix the problem of amateur mathematicians insisting that they’ve solved one of the problems, because the people who most need to read the book either won’t read it or will leaf through it without understanding it. But many others will find it an enjoyable and informative read and a stunning illustration of the power of reason. (For a taste of Richeson’s writing, read his article “When Math Gets Impossibly Hard”.)

Modern circle-squarers have an illustrious forerunner in the person of the philosopher Thomas Hobbes, who believed to his dying day that he’d squared the circle. But the situation in Hobbes’ day was different: squaring the circle had not yet been proved impossible. So although each construction Hobbes proposed was wrong (as his combative correspondent the mathematician John Wallis was quick to point out), it was reasonable of Hobbes to hope that a workable construction existed and that perseverance would disclose it.^{2} Nowadays we know better, but it’s important to be precise about what it is that we know, because it’s easy to misunderstand what the claim of impossibility says *and* what it doesn’t say.

The proof of the impossibility of squaring the circle hinges on the subtle issue of precisely what sorts of geometrical constructions are permitted. We need to specify not just the tools that are to be used (straightedge and compass) but the *way* in which they are to be used.^{3} What nineteenth century mathematicians proved is that *if* one restricts oneself to using straightedges and compasses in certain specified ways, *then* certain geometric constructions are impossible. The “if … then …” nature of the claim is in keeping with the contingent character of pure mathematics, as deftly described by Clarence Wylie in the poem that concluded my first Mathematical Enchantments post.

The proofs of the four impossibility theorems Richeson discusses aren’t easy, but in philosophical essence they’re not that different from the claim that it’s impossible to find two even numbers whose sum is odd. If we agree on the meaning of “even” (an integer is even when it can be written as twice an integer), then two even numbers, say 2*m* and 2*n*, have sum 2*m*+2*n*, which (being equal to 2(*m*+*n*)) is again an even number. It would be foolish to object “But there are infinitely many even numbers to try, and you’ve only considered finitely many of them; how can you be sure someone cleverer than you won’t someday find two even numbers whose sum is odd?”

**FROM GEOMETRY TO NUMBERS**

How did mathematicians prove that squaring the circle is impossible? By turning it into a statement about operations on numbers.

If we’ve got a line segment of length 1 (call it *AB*), we can use it as the radius of a circle whose area would be *πr*^{2} = *π*·1^{2} = *π*. If we start doing straightedge and compass operations, we can construct new points all over the place, but if you study them closely you’ll find a numerical pattern governing all those points: the distance between any two of them is a number that can be derived from the number 1 using only the operations of addition, subtraction, multiplication, division, and square roots, or what’s called a constructible number.^{4} So if at the end of the construction there are four points *WXYZ* forming a square of side-length *s*, *s* will have to be a constructible number. On the other hand, if the square has area *π* (the area of the original circle), the side-length *s* will have to be the square root of *π*. (For more on the square root of *π*, see my earlier blog post.) So if there were a way to square the circle in the Greek manner, the square root of *π* would have to be a constructible number.

Turning this around, if we knew that √*π* *weren’t* a constructible number, that is, if we knew that √*π* *can’t* be obtained from the number 1 using only +, −, ×, ÷, and √, we’d know that the circle *can’t* be squared in the Greek manner. In short, proving the impossibility of a geometric construction (squaring the circle) can be reduced to proving the impossibility of an arithmetic construction (constructing √*π* from 1 using only +, −, ×, ÷, and √). And that’s how mathematicians settled the ancient problem.

**FIVE IMPOSSIBLES**

But how did mathematicians show that √*π* and *π* aren’t constructible? To answer this (at least in outline), I present and discuss a graded sequence of impossibilities.

#1. You can’t arrive at the number 99 by adding 2’s together (and performing no other operations), no matter how many 2’s you add.

#2. You can’t arrive at the number 1/3 by adding and subtracting finitely many fractions whose denominators are powers of ten, or if you prefer decimals to fractions, finitely many terminating decimals.

#3. You can’t arrive at the square root of 2 through finitely many operations of addition, subtraction, multiplication, and division starting from the number 1.

#4. You can’t arrive at the cube root of 2 through finitely many operations of addition, subtraction, multiplication, division, and extracting square roots starting from the number 1. (That is, the cube root of 2 is not constructible.)

#5. You can’t arrive at pi through finitely many operations of addition, subtraction, multiplication, division, and extracting square roots starting from the number 1.

The truth of #1 is pretty clear: if you draw a highly exclusive number line with only even integers marked on it, there’s a big hole where 99 belongs, and if you’re playing the adding-2’s game you can stop short of the hole or you can leap over it but you can’t *arrive at it*.

The truth of #2 is similar in spirit, but subtler. Imagine that our snooty number club has decided to extend membership not just to all the formerly-excluded odd integers but to the terminating decimals as well: 0.3, 0.33, and the like. But 1/3 isn’t equal to 0.3, or 0.33, or 0.333, etc. It isn’t equal to any terminating decimal (or to any number expressible as a sum or difference of terminating decimals, because the result is just another terminating decimal). You could say that the set of terminating decimals has a hole where 1/3 would be. The difference between the hole at 99 in scenario #1 and the hole at 1/3 in scenario #2 is that in scenario #2 the hole doesn’t have a zone of unreachability around it. Terminating decimals will get you as close to 1/3 as you like (.3, .33, .333, etc. to the left of 1/3, and .4, .34, .334, etc. to the right of 1/3), but if only finitely many digits are allowed, your terminus will only be an approximation to 1/3, not the exact number 1/3.
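
The hole-without-a-zone-of-unreachability can be seen computationally. Here’s a minimal Python sketch using exact rational arithmetic; the helper `is_terminating` is my own name for the relevant test, not something from the post.

```python
from fractions import Fraction

def is_terminating(x: Fraction) -> bool:
    """A fraction equals a terminating decimal iff its reduced
    denominator has no prime factors other than 2 and 5."""
    d = x.denominator
    for p in (2, 5):
        while d % p == 0:
            d //= p
    return d == 1

# Sums and differences of terminating decimals stay terminating...
s = Fraction(3, 10) + Fraction(3, 100) + Fraction(3, 1000)   # 0.333
assert is_terminating(s) and not is_terminating(Fraction(1, 3))

# ...and they can creep up on 1/3 as closely as we like, without arriving.
approx = Fraction(0)
for k in range(1, 20):
    approx += Fraction(3, 10**k)       # 0.3, 0.33, 0.333, ...
    assert approx != Fraction(1, 3)    # never exactly 1/3
print(float(Fraction(1, 3) - approx))  # a tiny but nonzero gap
```

The gap after *k* digits is exactly 1/(3·10^*k*): shrinking without bound, but never zero.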

Moving on to #3, notice first what you *can* build up from the number 1 by the permitted operations: counting numbers (1+1=2, 1+1+1=3, etc.), zero (1−1=0), negative numbers (0−1=−1, 0−2=−2, etc.), and fractions (2/3, etc.). You can imagine that the number-line club has decided to relax its membership requirements again and now admits 1/3 and all the other rational numbers. But since the square root of 2 is irrational,^{4} you can’t arrive at the square root of 2 in this way. Some of the new numbers like 7/5 and 17/12 give something close to 2 when you square them, but none of them square to exactly 2.
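
To watch rationals like 7/5 and 17/12 creep up on the square root of 2 without ever landing on it, here’s a Python sketch using the classical recurrence (*p*, *q*) → (*p*+2*q*, *p*+*q*), a standard fact about √2 that isn’t part of the post itself:

```python
from fractions import Fraction

# The recurrence (p, q) -> (p + 2q, p + q) produces ever-better
# rational approximations to the square root of 2:
# 1/1, 3/2, 7/5, 17/12, 41/29, ...
p, q = 1, 1
for _ in range(10):
    x = Fraction(p, q)
    assert x * x != 2                    # the square is never exactly 2...
    assert abs(p * p - 2 * q * q) == 1   # ...in fact p^2 - 2q^2 is always +-1
    p, q = p + 2 * q, p + q
print(float(Fraction(p, q)))  # ~1.41421356..., closing in on sqrt(2)
```

The invariant *p*² − 2*q*² = ±1 is what guarantees “close but never equal”: a fraction whose square were exactly 2 would make that quantity 0.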

Impossibility #4 is related to one of the problems Richeson writes about, namely the problem of doubling the cube. Just as squaring the circle can be reduced to the problem of determining whether *π* is constructible, doubling the cube can be reduced to the problem of determining whether the cube root of 2 is constructible. You can imagine that the number-line club has expanded its membership yet again, to allow the square root of 2 to join, along with lots of other numbers. Every positive number that’s a member is welcome to extend the invitation to its square root, and every new member is given that same license! But there are still holes in this more inclusive version of the number line, and it can be proved that one of those holes is where the cube root of 2 sits. If you just want to *approximate* the cube root of 2, there’s a nice way to do it with error that becomes as small as you like, but if you’re only allowed finitely many operations, there’s no way to hit that number on the nose.
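
The post doesn’t say which approximation scheme it has in mind; one standard choice is Newton’s method, sketched here in Python.

```python
# Newton's method for the cube root of 2: repeatedly replace x by
# x - (x^3 - 2)/(3x^2), roughly doubling the number of correct digits
# at each step. (One standard scheme; the post doesn't commit to one.)
x = 1.0
for _ in range(8):
    x = x - (x**3 - 2) / (3 * x**2)
print(x)       # ~1.2599210498948732
print(x**3)    # ~2, though x itself is not a constructible number
```

Each iterate is a rational function of the previous one, so every iterate is rational; the sequence approaches the cube root of 2 but, as impossibility #4 says, no finite stage ever equals it.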

Finally we get to impossibility #5. Is pi in the inclusive club of constructible numbers or isn’t it? In attempting to find the answer, we’re led to a disconcerting question: what is this *π* number anyway? I mean, even if we don’t know what the cube root of 2 is numerically, we know what numerical property it’s supposed to have: when you cube it, you’re supposed to get 2. But what property distinguishes the number that *is* *π* from the infinitely many numbers that *aren’t* *π*? We know the geometrical meaning of *π* as the ratio of a circumference to a diameter, but what do we know about the number *π*?

Here the story takes a long detour through calculus, and I’m not going to give the details. Read Richeson’s book (and the references he provides) if you want to know more!

**MARKETING IMPOSSIBILITY**

Now we come upon an issue of public relations. It’s one thing to say “The sum of two even numbers is always even”, and another thing to say “You can’t find two even numbers whose sum isn’t even”. Even if they’re logically equivalent, they’re not psychologically equivalent: the latter assertion throws down a gauntlet.

I doubt anyone has ever spent much time trying to find two even integers whose sum is odd, because the reasons for the impossibility are pretty clear; but when the reasons are more intricate, as is the case for the four problems Richeson treats, and when the word “impossible” is used, people who relish a challenge are more prone to take the bait. And once a person has committed to the position that the mathematical establishment is wrong, it may be hard for them to back down with their pride intact.

Meanwhile, people who say that something is “impossible”, even when they’re in the right, may find themselves linked in people’s minds with all the fools who made the classic blunder of saying that such-and-such is impossible only to be proved wrong an embarrassingly short amount of time later. Yes, the most famous is “Heavier-than-air flying machines are impossible” (Lord Kelvin, 1895) but there are many others. Of course math is different from aeronautics and other sciences because proofs in mathematics have a kind of rigor not attainable in other fields, but not everybody understands that.

Labeling these results “theorems of impossibility” may not be the best look for the mathematical profession. I prefer to describe them as “theorems of necessity”, inasmuch as they assert that *if* you want to square the circle (say), *then* it’s necessary that you broaden the notion of construction that you’re allowing yourself. Come to think of it, the ancient Greek mathematicians didn’t limit themselves to what we moderns call “geometric constructions in the Greek style”; for instance, Archimedes, not limiting himself to the lines and circles that straightedge and compass afford, figured out a way to solve the angle-trisection problem using spirals.

I’m suggesting that theorems of impossibility should be recast in more positive form, so that the assertion becomes an invitation to creativity rather than a door slammed in one’s face. As a perverse exercise, I invite you to take some attractively positive result and recast it in a negative vein. The two assertions may be logically equivalent, but they can feel very different!^{5}

I’m not saying that every proof of impossibility can be recast in a more positive form, but I think it’s worth trying. For instance, instead of saying “These properties of the real number system show that it’s impossible for the square of a number to be -1”, we can say “If we’re going to have a number system in which -1 has a square root, we’ll have to drop one of the following properties of the real number system.” Likewise, instead of saying “It’s impossible for the angles of a triangle to add up to less than 180 degrees”, we can say “If we want a geometry in which the angles of a triangle can add up to less than 180 degrees, then …” and then try to figure out how to finish the sentence.

The story of geometry isn’t finished, and you don’t need to understand the 19th century impossibility proofs to find new results in Euclidean geometry. Although the geometric topsoil has been pretty thoroughly turned by past prospectors, there’s still gold waiting to be found, and proofs of impossibility can steer you towards the gold by steering you away from the places where there isn’t any.

Nonetheless, trisectors and circle-squarers and the like aren’t going to go away, and that’s a good thing. I propose that mathematical crankery ought to be encouraged among certain people as a way of channeling their latent susceptibility to outlandish beliefs; crazy notions about circles and circumferences are less harmful to society than crazy notions about pederasts and pizza parlors.^{6} Nor am I limiting my advocacy of mathematics-as-pacifier to the pacification of mathematical amateurs. I wish that one-time professional mathematician Ted “Unabomber” Kaczynski, instead of trying to change the world by killing people, had gone the way of Michael Atiyah, a truly great mathematician who at the end of his life came to mistakenly believe he’d proved the (still-unproved) Riemann Hypothesis. Atiyah’s delusions made him a happy man in his final days and hurt no one. Not all delusions are so harmless.

*Thanks to Sandi Gubin, Joe Malkevitch, David Richeson, Evan Romer, and Stan Wagon.*

Next month: Children of the Labyrinth.

**ENDNOTES**

#1. The song is from the musical “The Fantasticks”, which in turn is based on the 19th century play “Les Romanesques” by Edmond Rostand; both demonstrate the way prohibitions can backfire. A much earlier example comes in the Book of Genesis. Are we sure that Adam and Eve would have even noticed the Tree of Knowledge of Good and Evil if God hadn’t pointed it out to them, saying “Now whatever else you eat, *don’t eat that*“?

#2. For more on the Hobbes-Wallis conversation, see Martin Gardner’s essay “The Transcendental Number Pi”, chapter 8 in Gardner’s “New Mathematical Diversions”. Especially memorable is a quote from Hobbes that Gardner includes: “All you have said is error and railing; that is, stinking wind, such as a jade lets fly when he is too hard girt upon a full belly.” Hobbes was the more colorful writer, but Wallis was right on the math. Neither convinced the other. The correspondence has lessons for the age of Twitter.

#3. The last proviso matters because if you’re allowed to *mark* your straightedge, then the angle trisection problem ceases to be impossible and has a solution known to the ancient Greeks.

#4. Of course this assertion requires proof! I’m glad this is a blog and not a textbook, so I get to leave things there.

#5. As an example, I’ll “negativize” the sexy Banach-Tarski paradox. It’s sometimes couched in the form “You can divide a solid ball into a finite number of pieces and then reassemble those pieces to form two solid balls of the exact same size as the original” (though when it’s stated this way, I want to shout “No I can’t and neither can you!”). One could phrase this attention-grabbing positive claim in a negative fashion, asserting the nonexistence of a notion of “volume” that’s invariant under rotation and additive under finite dissection. This is equivalent to the usual statement (although the equivalence is trickier than you might think). But would this be a good way to sell it? I don’t think so.

#6. A friend who read an early draft of this essay suggested that the set of people who do kooky mathematics may be mostly disjoint from the set of people who engage in kooky politics (leaving aside Dr. Shiva Ayyadurai’s recent dabbling in election numerology). I asked David Richeson about the people who write back to him to share their trisections etc., and he said: “As for your question from earlier today—yes, I’ve definitely seen an uptick in crankish emails. But I have not been inundated. They do have some interesting variety. Some of them come by mail instead of email—with photocopied pages of complex geometric drawings. Some of them (quite a few of them) are written by people with advanced degrees. Some are degrees like psychology or medicine. Some are engineers. A lot of the people say that they’ve been working on these problems for years. I don’t know that I’ve received any who have explicitly said that my book is wrong. Rather, they just want to share their discoveries. They think I’ll be pleased to have learned that these problems are not, in fact, impossible. I have not gotten any like Woody Dudley writes about—by people who don’t want to share their solutions because they could be moneymaking ideas.” Here Richeson is referring to the books “Mathematical Cranks” and “The Trisectors” by Underwood Dudley.

The mistaken formula (*x*+*y*)^{2} = *x*^{2} + *y*^{2} is sometimes called the First Year Student’s Dream, but I think that’s a bad name for three reasons. First, (*x*+*y*)^{2} = *x*^{2} + *y*^{2} is not exactly a rookie error; it’s more of a sophomoric mistake based on overgeneralizing the valid formula 2(*x*+*y*) = 2*x* + 2*y*. (See Endnote #1.) Second, most high-school and college first-year students’ nocturnal imaginings aren’t about equations. Third, the Dream is not a mere dream — it’s a visitor from a branch of mathematics that more people should know about. The First Year Student’s Dream is a formula that’s valid and useful in the study of *fields of characteristic two*.

**FIELDS: THE BETTER NUMBER SYSTEMS**

What’s a field? At the most non-technical level, a field is a playground where things that behave like numbers get to romp under the influence of operations that behave like addition, subtraction, multiplication, and division. The Wikipedia page for fields has a more technical definition. The most famous fields are the rational number system (ℚ), the real number system (ℝ), and the complex number system (ℂ); they’re the envy of many other number systems because of how nicely subtraction and division work out in those three realms. The set of integers (ℤ) doesn’t get to be a field (sorry, ℤ!), even though it tries really hard. ℤ’s got the whole subtraction game down cold, but when you divide one integer by another (nonzero) integer, the answer isn’t always an integer.

We often write the operations in a field using the conventional symbols +, −, etc. even when the elements of the field aren’t numbers in the ordinary sense. Every field has two special elements called 0 and 1 that behave a lot like the familiar numbers 0 and 1; for instance, the formulas *x* + 0 = *x* and *x* × 1 = *x* and *x* × 0 = 0 are valid for all elements *x* of the field. (See Endnote #2.) Here’s an example of a field with just three elements, called *a*, *b*, and *c*, with the operations of addition and multiplication defined by tables:

| + | *a* | *b* | *c* |
|---|---|---|---|
| *a* | *a* | *b* | *c* |
| *b* | *b* | *c* | *a* |
| *c* | *c* | *a* | *b* |

| × | *a* | *b* | *c* |
|---|---|---|---|
| *a* | *a* | *a* | *a* |
| *b* | *a* | *b* | *c* |
| *c* | *a* | *c* | *b* |

In this field, *a* is the 0-element (because *x* + *a* = *x* for all *x* in the field) and *b* is the 1-element (because *x *× *b* = *x* for all *x* in the field), so it’d be better manners to call *a* “0” and to call *b* “1” (and to call *c* “2” while we’re at it).

**THE FIELD WITH TWO ELEMENTS**

What does it mean for a field to “have characteristic two”? It means that in that field, 1+1 = 0 and more broadly *x *+ *x *= 0 for *all* elements of the field. (The three-element field I just showed you does not have characteristic two because 1 + 1 isn’t 0, that is, because *b* + *b* isn’t *a*.) If Noah had lived in a world of characteristic two, he would have been extremely vexed when trying to load his ark: every time he paired up two animals in preparation for boarding, they’d mutually annihilate. (But see Endnote #3.) In characteristic two, adding an odd number of 1’s gives 1, while adding an even number of 1’s gives 0.

The smallest field of characteristic two has just 0 and 1 as elements; it’s called 𝔽_{2} (or GF(2)), and its addition and multiplication tables look like this:

| + | 0 | 1 |
|---|---|---|
| 0 | 0 | 1 |
| 1 | 1 | 0 |

| × | 0 | 1 |
|---|---|---|
| 0 | 0 | 0 |
| 1 | 0 | 1 |

The only entry in either of these tables that looks strange is 1+1 = 0; the rest are soothingly familiar. And even 1+1 = 0 might be familiar to you if you’ve seen modular arithmetic; what we’ve called “+” and “×” here are mod-2 addition and mod-2 multiplication. Mod-2 addition is the kind of addition that applies when two wrongs make a right, when the enemy of your enemy is your friend, when cancelling your cancellation of an appointment means you plan to show up after all, and in other real-world situations that I’m hoping some of you will contribute in the Comments.
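
In code, mod-2 addition and mod-2 multiplication are just the familiar bit operations XOR and AND; a minimal sketch:

```python
def f2_add(x, y):
    """Addition in GF(2) is exclusive-or: 1 + 1 = 0."""
    return x ^ y

def f2_mul(x, y):
    """Multiplication in GF(2) is logical 'and'."""
    return x & y

# The full addition table, including the one surprise, 1 + 1 = 0:
for x in (0, 1):
    for y in (0, 1):
        print(f"{x} + {y} = {f2_add(x, y)}    {x} x {y} = {f2_mul(x, y)}")
assert f2_add(1, 1) == 0 and f2_mul(1, 1) == 1
```

This is why 𝔽_{2} arithmetic is everywhere in computing: hardware already speaks it natively.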

**THE FIELD WITH FOUR ELEMENTS**

Things get a bit stranger when we move on to the next field of characteristic two, 𝔽_{4}, which has four elements that I’ll write as 0, 1, *α*, and *α*+1. (That letter is intended to be an alpha, though in some fonts it might look more like an a.) Here’s how they play together under the influence of + and ×.

| + | 0 | 1 | *α* | *α*+1 |
|---|---|---|---|---|
| 0 | 0 | 1 | *α* | *α*+1 |
| 1 | 1 | 0 | *α*+1 | *α* |
| *α* | *α* | *α*+1 | 0 | 1 |
| *α*+1 | *α*+1 | *α* | 1 | 0 |

| × | 0 | 1 | *α* | *α*+1 |
|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 0 |
| 1 | 0 | 1 | *α* | *α*+1 |
| *α* | 0 | *α* | *α*+1 | 1 |
| *α*+1 | 0 | *α*+1 | 1 | *α* |

(See Endnote #4.)

What are *α* and *α*+1? They don’t really have meaning, or rather, they don’t have meaning independent of the system 𝔽_{4} they belong to — and 𝔽_{4} only acquires meaning if we play with it (says the pure mathematician) or if we find uses for it (says the applied mathematician) and get comfortable with it. If the analogy helps, you can think of *α* as something like the number *i* that we encounter when we progress from the real number system to the complex number system; where *i* has the defining property *i*^{2} = −1, *α* has the defining property *α*^{2} = *α* + 1.

Speaking of −1, if I’d given you the subtraction table for 𝔽_{4}, you would have noticed that the subtraction table is the same as the addition table! In characteristic two, every element *x* satisfies *x *+ *x *= 0, so every element is its own additive inverse; and −*x* = *x* implies that subtracting *x* is the same as adding *x*.

This brings us back to the First Year Student’s Dream. In both ordinary algebra and the characteristic-two kind, (*x*+*y*)^{2} = *x*^{2} + *xy* + *xy* + *y*^{2}, but we handle the repetition of that *xy* in different ways. In ordinary algebra, we collect them to get *xy* + *xy* = 2*xy*; in characteristic two, we cancel them to get *xy* + *xy* = 0.

Once you’ve taught your brain how to dance to the music of characteristic two, the addition table for 𝔽_{4} doesn’t look very mysterious; when you add two elements, the 1’s and *α*’s cancel in pairs, just like the annihilating animals in the characteristic-two version of Genesis.

The multiplication table for 𝔽_{4} looks wonkier, but every element other than 0 has an alias of the form *α*^{i} for some integer *i*: we have 1 = *α*^{0}, *α* = *α*^{1}, and *α*+1 = *α*^{2}.

When you multiply elements of 𝔽_{4}, if one of the factors is 0, then the product is 0, but if both factors are non-zero and we write them as *α*^{i} and *α*^{j}, the product is just *α*^{i+j}, where the exponent can be reduced mod 3 because *α*^{3} = 1.
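
Here’s a small Python sketch of 𝔽_{4} arithmetic. The encoding is my own choice, not anything from the post: an element *c*₁*α* + *c*₀ is stored as the two-bit integer 2*c*₁ + *c*₀, so 0, 1, *α*, *α*+1 become 0, 1, 2, 3, and the multiplication routine reduces via the defining property *α*² = *α* + 1.

```python
# Elements of F4 as 2-bit integers: bit 1 holds the coefficient of alpha,
# bit 0 holds the constant term. So 0, 1, alpha, alpha+1 <-> 0, 1, 2, 3.

def f4_add(x, y):
    return x ^ y          # coefficients add mod 2, i.e. XOR

def f4_mul(x, y):
    # Multiply as polynomials in alpha, then reduce alpha^2 -> alpha + 1.
    result = 0
    if y & 1:             # the "times 1" part of y
        result ^= x
    if y & 2:             # the "times alpha" part of y
        shifted = x << 1  # multiplying by alpha shifts coefficients up
        if shifted & 0b100:        # an alpha^2 term appeared...
            shifted ^= 0b111       # ...replace it by alpha + 1 (0b011)
        result ^= shifted
    return result

ALPHA = 2
assert f4_mul(ALPHA, ALPHA) == 3                  # alpha^2 = alpha + 1
assert f4_mul(ALPHA, ALPHA + 1) == 1              # alpha * (alpha+1) = 1
assert f4_mul(f4_mul(ALPHA, ALPHA), ALPHA) == 1   # alpha^3 = 1
assert f4_add(3, 3) == 0                          # characteristic two
```

The assertions check exactly the facts used above: *α*² = *α*+1, *α*³ = 1, and every element cancels with itself.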

**ERROR-CORRECTING CODES**

For every number *q* of the form *p*^{k}, where *p* is a prime and *k* is a positive integer, there is a field with exactly *q* elements; it’s written 𝔽_{q} (or GF(q), the “GF” standing for “Galois field”, in honor of Évariste Galois, whose ideas these fields grew out of). Finite fields aren’t mere curiosities: they’re the engine behind error-correcting codes, as I’ll now illustrate.

Suppose I want to send digital information over a noisy channel. The information consists of four bits, each a 0 or a 1. The most obvious thing to do is to send each bit once, but the noise on the channel can flip a 0 to a 1 or vice versa, resulting in the recipient receiving the wrong message. What to do? I can’t get rid of the channel noise, but I can try to overcome it by building more redundancy into my transmission.

For instance, I could transmit each bit twice in a row, so that the message 1011 gets transmitted as 11 00 11 11 (where I’ve inserted spaces to make it easier for you to parse the string into pairs). If the recipient knows that I’m using this protocol, and if the channel introduces at most one error, the recipient can detect the occurrence of a mistake. For instance, if the recipient receives 11 00 11 10, and believes that at most one error occurred, then she can conclude that there was a corrupted bit in either the last position or the second-to-last position, so that the transmitted bit-pattern was actually either 11 00 11 00 or 11 00 11 11. But which was it?

To build more resiliency into the protocol, I could transmit each bit three times, so that the message 1011 gets transmitted as 111 000 111 111. If the recipient knows that I’m using this protocol, and if the channel introduces at most one error, the recipient can detect and correct the mistake by using a simple “majority vote” within each triple of bits: if all three bits in a triple agree, there was no corruption of that part of the message by noise; if the bits don’t agree, the bit that disagrees with the other two is the corrupted bit. This protocol is robust under the assumption that at most one of the twelve transmitted bits gets corrupted. That is, our repetition protocol is a *single-error correcting code*.
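
The triple-repetition protocol is easy to sketch in Python:

```python
def encode_repeat3(bits):
    """Transmit each bit three times in a row."""
    return [b for b in bits for _ in range(3)]

def decode_repeat3(received):
    """Majority vote within each triple corrects any single flipped bit."""
    return [1 if sum(received[i:i + 3]) >= 2 else 0
            for i in range(0, len(received), 3)]

sent = encode_repeat3([1, 0, 1, 1])          # 111 000 111 111
sent[10] ^= 1                                # noise flips one bit
assert decode_repeat3(sent) == [1, 0, 1, 1]  # the message survives
```

Flip any *two* bits within the same triple, though, and the majority vote gets fooled; the protocol is only single-error correcting.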

That’s an effective way to combat error, but one pays a steep price: for every four bits of actual meaningful data, the protocol requires that I transmit eight extra bits. This means that the transmission process will take three times as long as it would have without the repetition. That wouldn’t be a problem if the message really was just 4 bits long, but more likely my message is a movie consisting of gigabytes of data, divided up into 4-bit packets (or, in realistic applications, longer packets), so that increase of transmission time by a factor of three is a major pain.

Fortunately there’s a clever way to get the same single-error-correcting robustness with transmissions that use just three extra transmitted bits instead of eight. It requires the 8-element finite field 𝔽_{8}. Just as the elements of 𝔽_{4} can be written as degree-1 polynomials in *α*, the elements of 𝔽_{8} can be written as degree-2 polynomials in some element *β*. Here are the nonzero elements of 𝔽_{8}, along with their aliases (here I’m using the relation *β*^{3} = *β* + 1 to reduce higher powers of *β*):

| element | alias |
|---|---|
| 1 | *β*^{0} |
| *β* | *β*^{1} |
| *β*^{2} | *β*^{2} |
| *β* + 1 | *β*^{3} |
| *β*^{2} + *β* | *β*^{4} |
| *β*^{2} + *β* + 1 | *β*^{5} |
| *β*^{2} + 1 | *β*^{6} |

(*β*^{7} is just 1 again. For more about 𝔽_{8}, see Endnote #5.)

To add elements of 𝔽_{8}, use the left-hand alias and add just as you would ordinarily add polynomials, but with the cancellation rules 1 + 1 = 0 and *β* + *β* = 0 and *β*^{2} + *β*^{2} = 0. For instance, *β*^{2} + *β* plus *β*^{2} + 1 is just *β* + 1 (the *β*^{2}’s cancel). To multiply nonzero elements of 𝔽_{8}, use the right-hand alias and add the exponents mod 7. For instance, *β*^{4} times *β*^{6} is *β*^{4+6} = *β*^{10}, which is *β*^{3}.

𝔽_{8} gives me a way to pad my four-bit payload with three extra check-bits to get a seven-bit transmission that’s robust against single-bit corruptions. Say that my message bits are *b*_{1}, *b*_{2}, *b*_{3}, and *b*_{4}. Thinking of those 0’s and 1’s as elements of 𝔽_{8}, I form the element *b*_{1}*β*^{6} + *b*_{2}*β*^{5} + *b*_{3}*β*^{4} + *b*_{4}*β*^{3}; it can be rewritten as a degree-2 polynomial in *β*, say *b*_{5}*β*^{2} + *b*_{6}*β*^{1} + *b*_{7}*β*^{0}. If I transmit the bits *b*_{1},*b*_{2},*b*_{3},*b*_{4},*b*_{5},*b*_{6},*b*_{7} and an error occurs in any single one of the 7 positions, the recipient can use 𝔽_{8} arithmetic to detect the occurrence of an error, to diagnose where the error occurred, and to fix the error, obtaining precisely the 7-bit packet I transmitted, whose first 4 bits are the message I was trying to send. (See Endnote #6.)
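
Here’s a sketch of the scheme just described, with 𝔽_{8} elements stored as 3-bit integers and the powers of *β* tabulated from the relation *β*³ = *β* + 1 (one standard choice of defining relation; the post doesn’t pin one down). The syndrome computation in `decode` is how I’d flesh out the error-locating step; it isn’t spelled out in the post.

```python
# F8 elements as 3-bit ints: bits 2, 1, 0 = coefficients of beta^2, beta, 1.
# Build the powers beta^0 .. beta^6 using the relation beta^3 = beta + 1.
POW = [1]
for _ in range(6):
    x = POW[-1] << 1              # multiply by beta
    if x & 0b1000:                # a beta^3 term appeared...
        x ^= 0b1011               # ...replace it by beta + 1
    POW.append(x)

def encode(msg):
    """msg = [b1, b2, b3, b4]; returns the 7-bit codeword."""
    s = 0
    for bit, power in zip(msg, (6, 5, 4, 3)):
        if bit:
            s ^= POW[power]       # b1*beta^6 + b2*beta^5 + b3*beta^4 + b4*beta^3
    checks = [(s >> 2) & 1, (s >> 1) & 1, s & 1]   # b5, b6, b7
    return msg + checks

def decode(word):
    """word = [b1, ..., b7] with at most one flipped bit; returns the message."""
    syndrome = 0
    for bit, power in zip(word, (6, 5, 4, 3, 2, 1, 0)):
        if bit:
            syndrome ^= POW[power]
    if syndrome:                  # syndrome = beta^(7 - position) locates the error
        word = list(word)
        word[6 - POW.index(syndrome)] ^= 1
    return word[:4]

codeword = encode([1, 0, 1, 1])
for i in range(7):                # flip each position in turn; all are corrected
    corrupted = list(codeword)
    corrupted[i] ^= 1
    assert decode(corrupted) == [1, 0, 1, 1]
assert decode(codeword) == [1, 0, 1, 1]
```

The check bits are chosen so that the full codeword, read as a polynomial in *β*, evaluates to 0; a single flipped bit changes that value to a power of *β*, and since *β* has order 7, that power pinpoints which of the seven positions was hit.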

The data-transmission protocol I’ve just described was invented by Richard Hamming, who didn’t think of it in terms of finite fields. Later researchers in the theory of error-correcting codes figured out that Hamming’s construction, and other, even more powerful ways of building resiliency into digital communication, were related to the mathematics Galois had invented over a century earlier. Nowadays Galois fields play a role not just in making communication noise-resistant but in making it secure from snooping.

I teach college math, so most of my students already have learned not to write (*x*+*y*)^{2} = *x*^{2} + *y*^{2}, and hopefully have actual understanding of what the equation (*x*+*y*)^{2} = *x*^{2} + 2*xy* + *y*^{2} means and why it’s true. I don’t teach pre-college algebra, so I have no experience helping students transition from (*x*+*y*)^{2} = *x*^{2} + *y*^{2} to (*x*+*y*)^{2} = *x*^{2} + 2*xy* + *y*^{2}. As a teacher I try to find the kernel of truth in students’ wrong answers, so if I were in a high school classroom and someone fell for the First Year Student’s Dream, I might say something like “In abstract algebra, where *x* and *y* don’t count or measure things, there are situations where mathematicians actually do write (*x*+*y*)^{2} = *x*^{2} + *y*^{2}!” before bringing the class back to the mundane world where *x* and *y* are ordinary numbers. But maybe the detour would be distracting or just plain confusing. It might be best to just focus students on the meaning of *x* and *y* and the meaning of (*x*+*y*)^{2}. Even so, I can imagine things going badly. (See Endnote #7.)

*Thanks to Jeremy Cote, Sandi Gubin, Joe Malkevitch, Evan Romer and Glen Whitney*.

Next month: The Positive Side of Impossible.

**ENDNOTES**

#1. The true newbie’s delusion, I think, is that if you take an expression like 2+3×4, it doesn’t matter whether you do the multiplication first or the addition first. After all, 2+3+4 is the same whether it’s (2+3)+4 or 2+(3+4), and 2×3×4 is the same whether it’s (2×3)×4 or 2×(3×4). So you might think that the order of operations in mixed expressions shouldn’t matter either. But since (2+3)×4 = 20 while 2+(3×4) = 14, the first year student learns that order matters.

It’s the sophomore who, having learned the distributive law (*a*+*b*)×*c* = *a*×*c* + *b*×*c*, mistakenly overgeneralizes the underlying principle and slips into thinking that (*a*+*b*)↑*c* = *a*↑*c* + *b*↑*c* (where I’ve somewhat unconventionally written exponentiation as ↑, for reasons that the next sentence will make clear). It doesn’t help that mathematical convention has us write these formulas as (*a*+*b*)*c* = *ac* + *bc* and (*a*+*b*)^{c} = *a*^{c} + *b*^{c}, so that the true formula and the false one wear nearly identical typographical costumes.

#2. A comical feature of the abstract definition of a field is that it includes the requirement 0 ≠ 1. This might seem like a strange thing to see in a math book written for advanced undergraduates; I mean, who arrives at college not knowing that 0 and 1 are different numbers? But in the abstract setting, where the elements of a field might not even be numbers at all, and 0 and 1 could be quite strange beasts, it needs to be said, as part of the definition of a field, that 0 and 1 are not the same beast.

#3. Of course, in the sort of universe where the formula *a*+*a*=0 rules and everything is one-of-a-kind, the whole idea of sexual reproduction makes no sense, and Noah would need only one creature of each kind anyway.

#4. If you’ve seen modular arithmetic, you’ve seen mod-4 arithmetic, whose operation tables look like this:

The Junior’s Dream (I’m just making up this nomenclature as I go) is that the field with four elements is just mod-4 arithmetic, but this is false. In a field, every element besides 0 has a multiplicative inverse; that is, for every *x* ≠ 0, there’s a *y* such that *x* × *y* = 1. But in mod-4 arithmetic, 2 doesn’t have a reciprocal, so mod-4 arithmetic doesn’t give us a field.
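These claims about reciprocals can be checked by exhaustive search; here’s a short Python sketch (the helper name `inverses` is mine):

```python
# For each nonzero x in mod-n arithmetic, list the y's with x*y = 1.
def inverses(n):
    return {x: [y for y in range(1, n) if (x * y) % n == 1]
            for x in range(1, n)}

# In mod-4 arithmetic, 2 has no reciprocal (2*1=2, 2*2=0, 2*3=2):
assert inverses(4)[2] == []

# By contrast, in mod-5 arithmetic every nonzero element has exactly one:
assert all(len(ys) == 1 for ys in inverses(5).values())
```

The mod-5 check previews the general fact about prime moduli: when *n* is prime, every nonzero element gets a reciprocal, and that’s what makes mod-*p* arithmetic a field.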

What is true is that whenever *p* is a prime, mod-*p* arithmetic gives a field with *p* elements, called 𝔽_{p}. In fact, the first finite field mentioned in this article (with elements called 0 and 1) is just 𝔽_{2}, mod-2 arithmetic in disguise.

#5. The field with 2 elements sits inside the field with 4 elements, so you might think that the field with 4 elements sits inside the field with 8 elements. But that’s what we might call the Senior’s Dream. A finite field with *q* elements sits inside a finite field with *r* elements whenever *r* is a power of *q*, but 8 isn’t a power of 4.

#6. If the recipient receives the bit-string *c*_{1},*c*_{2},*c*_{3},*c*_{4},*c*_{5},*c*_{6},*c*_{7}, then she can use those bits to compute the element *c*_{1}*β*^{6} + *c*_{2}*β*^{5} + *c*_{3}*β*^{4} + *c*_{4}*β*^{3} + *c*_{5}*β*^{2} + *c*_{6}*β*^{1} + *c*_{7}*β*^{0} in 𝔽_{8}; if it’s 0, then no bit was corrupted, and if it’s one of the seven nonzero elements of 𝔽_{8}, then which particular element of 𝔽_{8} it is determines which of the seven positions in the bit-string is the location of the corrupt bit. Once the receiver knows which bit got corrupted, she can correct it, reconstructing the transmitted message, whose first four bits are the actual payload.

#7. Here’s my morbid fantasy about the First Year Student’s Dream. A student comes up to me and says “Teacher, I know you say that (*x*+*y*)^{2} = *x*^{2} + *y*^{2} is wrong, but I don’t see why.”

I reply “Well, let’s try it with numbers. What does (*x*+*y*)^{2} become when we replace *x* by 2 and *y* by 3?”

The student answers “2+3 is 5, so (2+3)^{2} is 5^{2} which is 25.”

“Right!” I say. “And what does *x*^{2} + *y*^{2} become when we replace *x* by 2 and *y* by 3?”

“That’s 2^{2} + 3^{2}, which is 4 plus 9, which is 13.”

“Right!” I say. “A different number.”

The student nods.

“Let’s try drawing a picture,” I say. I draw this picture:

“What’s the area of the square?”

“The side length is *x*+*y*, so the area is *x*+*y* squared.”

“Great!” I say. “Now let’s compute the area a different way by dividing that square up into pieces.” I draw this picture:

“What are the areas of the pieces?”

“Well, there’s an *x*-by-*x* square at the upper left, which has area *x*^{2}, and there’s a *y*-by-*y* square at the lower right, which has area *y*^{2}, and there are two rectangles left over.”

“Great! So when you say *x*+*y* squared equals *x* squared plus *y* squared, you’re on the right track, but you’re leaving out these two rectangles.”

“I see it!” says the student. “Those two rectangles each have area *x* times *y*. So that’s the 2*xy* that I was leaving out.”

“Excellent! So, what have you just learned?”

The student says “I learned that (*x*+*y*)^{2} = *x*^{2} + *y*^{2} isn’t true for numbers and isn’t true for geometry. But I still think it’s true for algebra.”

At this point, I think I start to cry.

**REFERENCES**

John Baylis, Error Correcting Codes.

Elwyn Berlekamp, Algebraic Coding Theory.

Al Doerr and Ken Levasseur, Applied Discrete Structures.

Raymond Hill, A First Course in Coding Theory.

Steven Roman, Introduction to Coding and Information Theory.


Three months after Nasrudin married his new wife, she gave birth to a baby girl.

“Now, I’m no expert or anything,” said Nasrudin, “and please don’t take this the wrong way, but tell me this: Doesn’t it take nine months for a woman to go from conception to childbirth?”

“You men are all alike,” she replied, “so ignorant of womanly matters. Tell me something: how long have I been married to you?”

“Three months,” replied Nasrudin.

“And how long have you been married to me?” she asked.

“Three months,” replied Nasrudin.

“And how long have I been pregnant?” she inquired.

“Three months,” replied Nasrudin.

“So,” she explained, “three plus three plus three equals nine. Are you satisfied now?”

“Yes,” replied Nasrudin, “please forgive me for bringing up the matter.”

A trickier example is an old riddle about a missing dollar:

Three guests check into a hotel room. The manager says the bill is $30, so each guest pays $10. Later the manager realizes the bill should only have been $25. To rectify this, he gives the bellhop $5 as five one-dollar bills to return to the guests.

On the way to the guests’ room to refund the money, the bellhop realizes that he cannot equally divide the five one-dollar bills among the three guests. As the guests aren’t aware of the total of the revised bill, the bellhop decides to just give each guest $1 back and keep $2 as a tip for himself, and proceeds to do so.

As each guest got $1 back, each guest only paid $9, bringing the total paid to $27. The bellhop kept $2, which when added to the $27, comes to $29. So if the guests originally handed over $30, what happened to the remaining $1?

It’s absolutely true that 2+27=29, but in the context of the story, adding the numbers 2 and 27 makes no sense, though it would make sense to subtract 2 from 27.
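To make the bookkeeping concrete, here’s the riddle’s ledger in a few lines of Python (a sketch of my own, with made-up variable names):

```python
# The hotel-riddle ledger. The guests' net outlay must equal what the
# hotel and the bellhop together keep; the riddle's "29" comes from
# adding the bellhop's $2 to the guests' $27 instead of subtracting it.
paid_by_guests = 3 * 10 - 3 * 1     # $30 handed over, $3 refunded: $27 net
hotel_keeps = 25
bellhop_keeps = 2

assert paid_by_guests == hotel_keeps + bellhop_keeps    # 27 = 25 + 2
assert paid_by_guests - bellhop_keeps == hotel_keeps    # subtract, don't add
```

There is no missing dollar: $27 went out of the guests’ pockets, and $25 + $2 accounts for every cent of it.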

An accountant friend of mine (call him Lenny) used to be in partnership with another accountant (call him Bob). While the partnership was operating, each partner had a “capital account” in the partnership into which they paid their earnings and out of which they paid their expenses. Over time, Lenny’s practice thrived while Bob’s languished, so Lenny’s capital account had a positive balance and Bob’s had a negative balance. Eventually Bob decided to dissolve the partnership. Both agreed that Bob’s negative balance was a problem, but Bob insisted that to set things right it was *Lenny* who needed to make up the negative in *Bob’s* capital account. (And Bob was not joking, which come to think of it may explain why Bob’s accountancy practice was doing so badly to begin with.)

Although these calculations are nonsensical, at least they combine months with months, or dollars with dollars. There are examples of nonsensical calculations in which the units don’t even match up. My favorite examples of such nonsense are the tongue-in-cheek calculations seen on certain road-signs, such as this one from Gold Hill, Colorado:

(Ben Orlin wrote about this sign a few years back.)

Sometimes addition is wrong even when the units match up because the units aren’t the kind of thing that can be added. An amusing example was relayed to me by pre-reader Mark Saul, whose aunt was afraid to use both levels of her double-decker oven at the same time, because “If you have 300 degrees on top, and 300 degrees on bottom, that’s 600 degrees. You could have a fire!”

Sometimes addition is wrong because of what’s left out, as in the practice of breaking down the American electorate into Democrats and Republicans and ignoring the independents (but see Endnote #2). Another mistake happens when you break down a set into smaller sets but ignore overlap between the sets.

Do you have a favorite real-world example of people adding numbers (or more generally subtracting, multiplying, or dividing them) when in fact that calculation isn’t sensible? Please post to the Comments!

The reason I’m bringing this up is that over the summer Twitter saw a lot of discussion of the equation 2+2=4. The equation is seen by some as a touchstone for objectivity and truth and by others as a manifestation of a repressive social system. To read what Keith Devlin has to say about this, read his blog post. To read what Michael Barany has to say about this, read his Twitter thread and the related article. If you want to know what I think, read on!

**APPLES AND ORANGES**

Some people say you can’t add apples and oranges, but of course you can: two apples plus two oranges equals four pieces of fruit. Pictorially:

equals

In the first picture, we’re distinguishing between pomaceous and citrus fruits; in the second picture, we’re ignoring the differences between them. The curves have been redrawn but no fruits have been harmed or moved.

To me, 2+2=4 isn’t a fact about the physical world; it’s more a window through which I view the world, or a channel for my thoughts about the world. It says that the top and bottom pictures show the same state of the world. All that’s changed is the way I’m compartmentalizing things.

Waxing philosophical, I’ll say that the equation 2+2=4 serves as an emblem of the duality between the analytic and synthetic modes of thought. The left-hand side of the equation represents the way we take the world apart; the right-hand side represents the way we put the world together out of the pieces we’ve divided it into. In our attempts to make sense of the world, we need both the kind of thinking that attends to nuances and distinctions and the kind of thinking that can see past those distinctions.

Of course, in saying that both kinds of thinking are needed to help us make sense of the world, I’m presupposing there’s a world for us to make sense of, and this brings us to another philosophical take on 2+2=4, which is that the formula reminds us that *what is is what is*, regardless of whether or how we mentally break it into pieces. In other words, *reality is real*.

I believe that, but I also believe that *reality is really hard to know*. Our prejudices get in the way, and categories of thought that seem neutral in themselves can subtly affect our interpretations of reality. Self-styled champions of the concept of “objectivity” (you can often recognize them because of the way they hype “2+2=4” and say things that amount to “Ha ha, I’m objective and you’re not”) all too often are championing *their* convictions about what’s true, ignoring all the ways in which their experience is partial, their interpretations biased, and their statements couched in language that’s vague and subject to multiple interpretations. We may think that truth is a butterfly and language the net that captures it, but too often truth is a rabbit and language is a mound of jello that we throw at the rabbit, and some of the jello sticks to the rabbit but most of it falls off when the rabbit runs away, and we look at where the jello is and argue about what it tells us, but the jello landed mostly where we threw it, as it was inevitably bound to do, and the rabbit is long gone (and doesn’t eat jello anyway).

**ABSOLUTE TRUTH, AT A PRICE**

Let’s rehash an old line of argument about 2+2=4 and see where it leads us.

It sure *seems* as if 2+2=4 is saying something about the world. But what if I’m counting animals and two of the four make a baby together? Then 2+2 makes 5, right?

“Clever,” you say, “but you know that’s not what I meant; 2+2=4 applies to situations in which you’re combining collections without adding something new.”

To which I reply “What if I’m counting clouds, and two of the clouds merge? Then 2+2 makes 3.”

“Okay,” you say, “but you know that’s not what I meant either; when you combine collections you also have to make sure you’re not blurring the boundaries between things. … Come to think of it, you’re also not allowed to lose any of the items (because that’s the gotcha you were going to try next, am I right?). If all those conditions are met, then the new collection will contain four objects.”

But then I ask you to explain what you mean by “blurring the boundaries”. When two clouds seem to merge, that may be a trick of perspective. They may be at different altitudes, and only seem to merge because of where I’m viewing them from.

This kind of back-and-forth can go on for a long time. It might seem as sophomoric as dorm-room arguments about solipsism (“How do we really know anything? How do I know I’m not a lonely ghost in a void?”), but 20th century philosophers wrestled with the metaphysics of addition in a serious way. The more you explain what you mean by saying that 2+2 equals 4, and the more I raise pesky objections, and the more you counter my objections by hemming in the assertion with qualifications that remove loopholes, the less 2+2=4 looks like a statement about reality and the more it looks like a statement about how we look at the world, the interpretations we make, and the rules we apply in mentally dissecting and reassembling the world.

Take this to extremes, and you find 2+2=4 to be a statement that’s not about the world at all, but about how the mind perceives and categorizes. Now take one crucial step further and remove any explicit acknowledgment of Mind, so that all that’s left is the possibility of a Mind and the possibility of a World, and the truths of math seem to survive as ghostly sorts of constraints on possible minds *—* ways of saying “Well, if there *were* a world, and there *were* a mind that tried to understand that world by cutting it into pieces and putting the pieces back together again, here are the kinds of experiences that that mind would have.”

This non-place we’ve arrived at is a strange place to be, but it’s where I do most of my work. Some call it Plato’s realm of pure form, and some even think it’s more real than ordinary reality, though that sounds crazy to me. But I can see the appeal of the conceit. The realm feels like a place to me, with stable features that persist from one visit to the next. It’s something like the ghostly void of the solipsist fantasy, but equipped with furniture to bump into. In this realm, or at least the sub-realm first mapped by Giuseppe Peano, it’s a fact that 2 is 1+1 (or rather a definition: 2 is *defined* to be 1+1), it’s a fact that 3 is 2+1 (again, by definition), and it’s a fact that 4 is 3+1 (once again, by definition), and armed with these facts we can prove 2+2 = 4:

2+2 = 2+1+1 = 3+1 = 4

(If you want a longer proof, you can insert an extra step where the associative property of addition is exploited, but if you’re the kind of person who knows about the associative property then you probably didn’t need me to tell you that. And if you know that Peano didn’t actually define 2 as 1+1 but rather defined 2 as “the successor of 1”, and similarly for 3, 4, etc. … well, then you don’t need me to tell you that either.)
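To make the Peano story concrete, here’s a toy encoding in Python (my own illustration, not Peano’s formalism): numbers are built from zero by a successor operation, and addition is defined by the two equations *n*+0 = *n* and *n*+S(*m*) = S(*n*+*m*).

```python
# Numbers as nested tuples: ZERO, S(ZERO), S(S(ZERO)), ...
ZERO = ()
def S(n):
    return ("S", n)               # the successor of n

def add(n, m):
    if m == ZERO:
        return n                  # n + 0 = n
    return S(add(n, m[1]))        # n + S(m) = S(n + m)

ONE, TWO = S(ZERO), S(S(ZERO))    # 2 is the successor of 1
THREE, FOUR = S(TWO), S(S(TWO))

assert add(TWO, ONE) == THREE     # 2+1 = 3, by definition of 3
assert add(TWO, TWO) == FOUR      # 2+2 = 2+1+1 = 3+1 = 4
```

The proof that `add(TWO, TWO)` is `FOUR` unwinds in exactly the three steps of the chain of equalities above: the recursion peels one successor off the second summand at a time.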

So we’ve found a realm in which 2+2 is absolutely 4. What else is there to say?

It turns out there’s a lot more to say. Because even if the Platonic realm is (as some claim) more real than we are, we only know it through our finite and fallible human minds. “2+2 = 4” became human knowledge through historical/social/psychological processes (how else could humans have come to know it?), and whenever people come into a story the story gets complicated.

**MATH, COMMERCE, AND CAPITAL**

One way to bring people into the story is to consider where the symbols^{3} in “2+2=4” came from, and how. The 2’s and the 4 are Hindu-Arabic numerals, brought into Europe for commercial purposes in the late Middle Ages. One of the original Arabic treatises on the decimal system, Muhammad ibn Musa al-Khwarizmi’s “The Hindu Art of Reckoning”, gave us the word “algorithm” as a latinization of the author’s name. Sometimes it’s claimed that the new algorithms won out in Europe because they were more efficient for calculations than the abacus methods that preceded them, but that’s not entirely clear. What is clear is that calculations done in writing using the new symbols were more auditable than the evanescent motions of sliders on a rack or tokens on a board, and that this auditability helped trading companies expand their operations to ever-larger regions of the globe.

Historians of mathematics agree that the explosion of trade in the late Middle Ages and early Renaissance played a major role in spurring the development of mathematics. Mathematics in turn made industrial capitalism possible, by streamlining the ways in which the flow of capital and labor could be regulated. Regardless of whether you think capitalism is “good” or “bad” (I think it’s both), it’s important to recognize the economic aspect of 2+2=4 as part of its social history.

Let me acknowledge some personal bias: I *like* money. And I don’t mean that I like to *have* money (though that’s true too). What I mean is, I’m glad that I live in a society that has a universal medium of exchange, because I’d find barter overwhelmingly complicated. Which brings me to a story about a fateful 19th-century barter between a European and an African whose reverberations are still with us.

**THE GENTLEMAN AND THE SHEPHERD**

The famous Charles Darwin had a famous (and nowadays also infamous) cousin, Francis Galton. Like Darwin, Galton did a fair bit of traveling. In his 1853 book “Tropical South Africa”, describing his travels among the Damara people in what is now Namibia, Galton wrote:

When bartering is going on, each sheep must be paid for separately. Thus: suppose two sticks of tobacco to be the rate of exchange for one sheep, it would sorely puzzle a Damara to take two sheep and give him four sticks. I have done so, and seen a man first put two of the sticks apart and take a sight over them at one of the sheep he was about to sell. Having satisfied himself that that one was honestly paid for, and finding to his surprise that exactly two sticks remained in hand to settle the account for the other sheep, he would be afflicted with doubts; the transaction seemed to come out too “pat” to be correct, and he would refer back to the first couple of sticks, and then his mind got hazy and confused, and wandered from one sheep to the other, and he broke off the transaction until two sticks were put into his hand and one sheep driven away, and then the other two sticks given him and the second sheep driven away.

Stories like this are part of a long tradition in which Europeans depicted Africans and other foreigners as being ignorant or stupid. Even when Galton acknowledges differences between one African and another, he attributes those differences to genetic causes (“The Damaras were for the most part thieving and murderous, dirty, and of a low type; but their chiefs were more or less highly bred”). Galton went on to found the eugenics movement, a program of conscious breeding among upper class Brits that might have ended up a quaint footnote in the annals of English dottiness but for the fact that its poisonous ideology of racial difference found fertile soil in the post-Civil War United States, Hitler’s Germany, and elsewhere.

I learned this story from an essay by Michael Barany, listed in the References. One of the points Barany makes is that we never get to hear the shepherd’s side of the story. Meanwhile, Galton’s account is suspiciously omniscient; in particular, I’m struck by the phrase “his mind got hazy and confused”. Really? How could Galton know what was going on in the shepherd’s mind? I’m often puzzled in the classroom by things my students say, but I don’t pretend to know what’s going on in their minds. Sometimes I discover that a student’s wrong answer is the right answer to another question, and that my question wasn’t completely clear *—* and that’s in a classroom in which the teacher and the students are all speaking the same language! Galton had to rely on interpreters and his own guesses, and he may have missed what was really going on.

Consider that, for a shepherd, a flock is a collection of individuals, more like 1+1+1+1 than 2+2 or 4. The value of a sheep as measured in sticks of tobacco might vary from sheep to sheep. More importantly, value might not be what economists call additive. For instance, a female sheep is probably worth more than a male (as we might say in symbols nowadays, F > M), but two female sheep are probably worth less than a breeding pair (F + F < M + F), which is a mathematical contradiction if we assume value is additive.^{4} Galton’s assumption that any shepherd who’s willing to trade two sticks of tobacco for one sheep *must* (if he’s rational) be willing to trade four sticks of tobacco for two sheep makes sense in the context of a mercantile economy based on interchangeable goods, but doesn’t fit so well with a barter economy based on goods that are far from identical. And even when goods are identical, modern economists recognize that value can behave nonadditively. Here’s my favorite example: If you’re willing to sell me one of your kidneys for two million dollars, does it follow that you’re willing to sell me *both* your kidneys for *four* million dollars?

Galton also derided the shepherd for being unwilling to start the second transaction until the first was completed. But in fact there are pitfalls associated with making two overlapping transactions, notably the “change raising scam”, which gets its punch from cognitive overload; when there’s a lot going on, the person operating a cash register may forget that a bill on the counter is supposed to be on its way from the customer’s pocket to the register and not the other way around.

Even smart people can be fooled by the con, and more importantly, most people are unaware that they can be conned in this way. For all their education, modern urbanites in the English-speaking world usually lack metacognitive awareness of how easily they can be fooled when their short-term memory is being overtaxed.

In comparison, that shepherd could be viewed as a metacognitive sophisticate, wisely separating two transactions rather than trying to combine them!

**NEWER CONS**

The big cons going on these days involving true-but-irrelevant math are hidden from view inside computers, or should I say, inside black-box algorithms. When I was young, the word “algorithm” meant a procedure for operating on numbers (think: long division), and later on when I studied computer science, some of the basic algorithms I got to know were procedures for *sorting numbers*, with cute names like QuickSort, HeapSort, MergeSort, and BatchSort. You could go into a lecture with no previous knowledge of such an algorithm and emerge an hour later with a clear understanding of why the algorithm always found the right answer and how long it took to find it.

Nowadays the term “algorithm”, when used outside of academia, mostly refers to procedures for compressing information, such as the procedure that takes your entire Netflix viewing history (along with dozens of other facts about you and an enormous number of facts about movies) and distills it down to a recommendation for what movie you’d enjoy watching next. You can’t get an intimate knowledge of these algorithms in an hour, a day, or a year. These algorithms aren’t humanly understandable because they weren’t created by humans; they were created (or at least tuned) by computers through a process called machine learning. It may not make obvious sense to add your zip code to your age, but if a learning algorithm finds that this sum is predictive of what movies you’ll watch, that’s what it’ll use.

Algorithms, in the new sense of the word, aren’t perfect (it’s not even clear what constitutes the “right” answer), but imperfection isn’t always a big problem. If a movie recommendation engine picks a movie you don’t like, you can always switch to another movie. The real problem comes when algorithms of this new kind are used not for recommending movies but for *sorting people*, deciding who is credit-worthy, college-worthy, job-worthy, or parole-worthy. Guess who tends to benefit from these algorithmically-driven decisions: those who already have lots of social privilege or those who don’t?

When a people-sorting algorithm makes a mistake (that is, outputs an answer that most observers agree is wrong), it isn’t easy to track down where the algorithm went astray because it’s so inhumanly complicated. And that’s assuming that the owners of the algorithm are willing to open the hood, which is usually not the case. It takes a big outcry (like the recent International Baccalaureate scandal) to force the owners of the algorithm to open that hood and let the world peek inside. And even when there’s an outcry, there’s a tendency to view the algorithms as authoritative, because aren’t the algorithms based on math? And how can math be wrong?

I mentioned before that a big advantage of the algorithms introduced in Europe in the late Middle Ages was their auditability. How ironic that 21st century algorithms turn back the clock and return computation to the shadows!

Most victims of algorithmic injustice are powerless individuals, unaware of each other and sometimes unaware of the nature of their victimization. This problem is likely to get worse, not better, in years to come, in education and elsewhere in society. And unlike a miscalculated Netflix recommendation, this is not a movie we can just stop watching.

**SOCIAL MEDIA**

I talked about how the story of the equation 2+2=4 intersects with the story of commerce and capitalism and then I talked about how it intersects with racism, eugenics, and genocide. So we’re not on purely mathematical turf anymore, and you can imagine how battles over 2+2=4 could happen on social media as a proxy for battles over real-world issues.

Many people I sympathize with politically stake out positions in these battles that I don’t find very convincing. Some of them try to find contexts in which 2+2=5, but doing so always involves subverting the commonly shared meaning of 2+2=4 (a subversion that they sometimes admit to and sometimes don’t). For instance, it’s absolutely true that if we round the numbers in the equation “2.3+2.4=4.7” to the nearest whole number, we indeed get “2+2=5”. But that doesn’t mean 2+2=5.
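For the record, the rounding trick can be spelled out in a three-line Python check (my own illustration):

```python
# "2+2=5" via rounding: each summand rounds to 2, but the true sum 4.7
# rounds to 5. The equation between the rounded numbers is not an
# equation between the numbers themselves.
a, b = 2.3, 2.4
assert round(a) == 2 and round(b) == 2
assert round(a + b) == 5
```

The sleight of hand is that rounding happens before the addition on the left and after the addition on the right, so the two sides of “2+2=5” aren’t talking about the same operation at all.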

I can play this game too. In music theory the interval of a 2nd (a step up from F to G) plus the interval of a 2nd (a step up from G to A) yields the interval of a 3rd (from F to A). But does that imply “In music theory, 2+2=3”? I don’t think so.

Sometime soon I’ll tell you about a context in which mathematicians write 1+1=0, and another context in which they write 1+1=1. But these are different number systems than the ones you learned about in school. In these number systems, the symbols “0”, “1”, “+”, and even “=” can have a different meaning than in the standard context, and mathematicians don’t pretend otherwise. And even in these bizarro arithmetics, “2+2=4” is still true (though in one of the arithmetics 4=5 so 2+2=5 is also true).

Interestingly, math researchers on Twitter were on the whole fairly sympathetic to challenges to 2+2=4. When you spend years training your imagination to make sense of the bizarre and counterintuitive, and someone says “There are contexts in which 2+2 isn’t 4”, part of your mind goes “Hmm, I wonder what such a context might be?”

In my reading on equity, mathematics, education, etc., one comment I found trenchant (from the book “Ethnomathematics: Challenging Eurocentrism in Mathematics Education”) is the juxtaposition of the true assertions “2 apples plus 2 pears equals 4 fruits” and “2 pants plus 2 jackets equals 2 suits”. This example demonstrates how real-world knowledge can invisibly pervade a math problem, making mathematical content accessible to those who have the real-world knowledge and inaccessible to those who lack it. It makes me think harder about what knowledge my students bring into the classroom.

I’m unmoved by slogans like “Western mathematics is a tool of cultural imperialism”, though I’m sympathetic to concrete critiques of specific teaching practices that I believe are the wellspring of the math-as-imperialism ideology voiced by some math educators.^{5} At the same time, I’m aghast at the sort of sexist and racist tropes that have been flung at proponents of critical theory by self-proclaimed defenders of “objectivity” when the proponents are women or people of color or both. I’ll take ideology over bigotry any day. In a profession that’s still too dominated by white men, I’m willing to round down some overheated rhetoric.

I think the people who voice concern about 2+2=4 being a cultural imposition should be more worried about the algorithms that use valid equations like 2+2=4 in invalid ways to create inequitable outcomes. To be empowered to challenge those algorithms, students need an understanding of the mathematics of technocracy, not the mathematics that their ancestors used.

**SUMMING UP**

I said at the beginning that 2+2=4 is seen by some as a touchstone for objectivity, and by others as a manifestation of a repressive social system. And now you know that I think both are right. The mercantile-capitalist-technological civilization that gave us the mathematics of the modern era has done wonderful things and terrible things. Most scientific knowledge was discovered by flawed people, working in flawed institutions that excluded many people from contributing, situated within economic systems that exploited many people at home and abroad. There’s no way to purge this knowledge of its origins. We can and must try to do better, but that doesn’t mean we throw away what we’ve learned.

If the equation 2+2=4 can be viewed as a story of accumulation and wealth-building, its companion 4=2+2 tells a different story, of equitable sharing. Likewise, if 4=2+2 seems to tell a story of divisiveness, of a society coming apart into factions, 2+2=4 tells a story of people coming together. We need to find better ways of dealing with difference, cooperating, and sharing. Properly applied, the mathematical knowledge that our species has acquired, including 2+2=4, can bring us closer to a world in which all can share the fruits (the apples *and* the oranges) of progress.

*Thanks to Michael Barany, Keith Devlin, Sandi Gubin, Brian Hayes, David Jacobi, David Merfeld, Ben Orlin, Evan Romer, and Mark Saul.*

Next month: When 1+1 Equals 0.

**ENDNOTES**

#1. I first heard this story as a Jewish tale about the citizens of a city of fools called Chelm, and my guess is that versions of the story exist in other cultures as well.

#2. Although a voter can’t be registered with more than one party, a candidate can be endorsed by more than one, as for instance happened when Earl Warren, back before he served on the U.S. Supreme Court, ran for Governor of California and won endorsements from both major parties and some minor ones as well.

#3. The symbols “+” and “=” came centuries after “2” and “4”. “+” was introduced in Europe in the 1400s as an abbreviation of the Latin word “et” (meaning “and”). The equals sign was invented by Robert Recorde in the 1500s as a deliberate coinage, consisting of twin lines of the same length “because no two things can be more equal”.

#4. Thomas Aquinas similarly argued that just because an angel is better than a stone, it doesn’t necessarily follow that two angels are better than one angel and one stone. Perhaps he should be hailed as the patron saint of diversity.

#5. Often you’ll see titles like “Western mathematics *as* a tool of cultural imperialism”; the “as” can be read either as “is nothing more than” or “is in some respects”, according to one’s audience, and I think the ambiguity is intentional. Maybe someone should write an article called “‘*As*’ As Multivalent Signifier”?

**REFERENCES**

Marcia Ascher and Robert Ascher, Ethnomathematics, in “Ethnomathematics: Challenging Eurocentrism in Mathematics Education”.

Michael Barany, One, Two, Many: The Prehistory of Counting. https://www.newscientist.com/article/mg21028081-500-one-two-many-the-prehistory-of-counting/

Meredith Broussard, When Algorithms Give Real Students Imaginary Grades. https://www.nytimes.com/2020/09/08/opinion/international-baccalaureate-algorithm-grades.html

Keith Devlin, Of Course, 2+2=4 is Cultural. That Doesn’t Mean the Sum Could be Anything Else. https://www.mathvalues.org/masterblog/of-course-2-2-4-is-cultural-that-doesnt-mean-the-sum-could-be-anything-else

Cathy O’Neil, Mutant Algorithms Are Coming for Your Education. https://www.bloomberg.com/opinion/articles/2020-09-08/mutant-algorithms-are-coming-for-your-education

Cathy O’Neil, Weapons of Math Destruction. https://weaponsofmathdestructionbook.com/

Ben Orlin, The Smartest Dumb Error in the Great State of Colorado. https://mathwithbaddrawings.com/2015/08/19/the-smartest-dumb-error-in-the-great-state-of-colorado/

To get a clearer sense of what counts as a good answer, let’s consider a bad answer. You *could* remove 1/25th of each muffin, give an almost-complete muffin to each of the first 24 students, and give the 24 slivers to the last student. Then everyone gets 96% of a muffin, but it’s a pretty crumby scheme for the student who gets nothing but slivers. We’d like to do better. Can you find a scheme in which the smallest piece anyone gets stuck with is bigger than 1/25 of a muffin? Can you find a solution in which the smallest piece is a *lot* bigger? After you’ve found the best solution you can, how might you try to prove that it’s the best solution anyone could ever find? And how would you solve the problem if there were a different number of muffins and/or a different number of students trying to share them? Puzzles of this kind can be challenging and addictive, and the general solution wasn’t found until last year.

The Muffin Problem was invented in 2008 by a friend of mine, the recreational mathematician Alan Frank. He asked: If we have *m* muffins to be divided among *s* students, how can we cut up the muffins and apportion the pieces so that each student gets *m*/*s* of a muffin, *and* so that the smallest piece is as large as possible? (Making sure that even the *smallest* piece is large ensures that *all* the pieces are large. Kind of like saying “A society is only as rich as its poorest citizens.”) Let’s define *f*(*m*,*s*) to be the largest number *f* for which there’s a scheme in which all the pieces are of size at least *f*. (Here “*f*” is for “function”, “fraction”, and “Frank”.)

Alan was inspired by a real-world problem involving fewer than 24 muffins and fewer than 25 muffin-eaters. In fact, there were just two muffin-eaters, his children, but in dividing up an odd number of muffins between the two kids he began to think about the general case, in the way that a mathematician does. I’m illustrating his problem using the numbers 24 and 25 as an homage to the classic children’s book “Math Curse” about the math problems that lurk everywhere around us, just waiting for someone like Alan to notice them. Author Jon Scieszka and illustrator Lane Smith raise the problem of how one should share 24 cupcakes among 25 people. Their way of solving the problem contains much practical wisdom, but it dodges the problem of how one should share the cupcakes if one must.

I’ve challenged you to find *f*(24,25). We already know that *f*(24,25) is at least 1/25 because that’s the size of the smallest piece in the “crumby scheme”, but surely you can do better. It’s also not hard to prove that *f*(24,25) is less than 1/2. You can find the solution to the *f*(24,25) problem in Endnote #1.

**TRICKIER THAN IT LOOKS**

As an illustration of how tricky sharing muffins can be, here’s the best scheme for dividing seven muffins four ways. Split six of the seven muffins into a piece of size 7/12 and a piece of size 5/12 and split the seventh into two equal parts of size 1/2 (which I’ll write as 6/12 to give all the fractions a common denominator). Two of the students each get three pieces of size 7/12 while the other two students each get three pieces of size 5/12 and one piece of size 6/12. Here’s a depiction of the scheme in which the seven rows not including the bottom row correspond to the seven muffins and the four columns to the right of the equals-signs correspond to the four students. The smallest piece in this scheme is 5/12 of a muffin, so *f*(7,4) is at least 5/12. In fact, it can be proved that you can’t beat this, so *f*(7,4) is exactly 5/12.
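Since the depiction doesn’t reproduce well in this format, here’s a quick sanity check of the scheme in Python (my own sketch, using the standard `fractions` module; the layout is reconstructed from the verbal description above):

```python
from fractions import Fraction as F

# Each inner list is one muffin, recorded as (piece size, student) pairs.
# Muffins 1-6 are split 7/12 + 5/12; muffin 7 is split 6/12 + 6/12.
# Students 0 and 1 each get three 7/12 pieces; students 2 and 3 each
# get three 5/12 pieces and one 6/12 piece.
muffins = [
    [(F(7, 12), 0), (F(5, 12), 2)],
    [(F(7, 12), 0), (F(5, 12), 2)],
    [(F(7, 12), 0), (F(5, 12), 2)],
    [(F(7, 12), 1), (F(5, 12), 3)],
    [(F(7, 12), 1), (F(5, 12), 3)],
    [(F(7, 12), 1), (F(5, 12), 3)],
    [(F(6, 12), 2), (F(6, 12), 3)],
]

# Every muffin is used up completely...
assert all(sum(size for size, _ in m) == 1 for m in muffins)

# ...every student receives exactly 7/4 of a muffin...
shares = [F(0)] * 4
for m in muffins:
    for size, student in m:
        shares[student] += size
assert shares == [F(7, 4)] * 4

# ...and the smallest piece has size 5/12.
smallest = min(size for m in muffins for size, _ in m)
print(smallest)  # 5/12
```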

My favorite fact about the muffin problem is the “muffin duality law” first noticed by mathematician Erich Friedman:

*f*(*s*,*m*) = (*s*/*m*) *f*(*m*,*s*),

or in its most symmetrical form,

*m f*(*s*,*m*) = *s* *f*(*m*,*s*).

Duality tells us that there’s a relationship between the problem of dividing 24 muffins among 25 students and the seemingly unrelated problem of dividing 25 muffins among 24 students. This symmetry between muffins and students seems strange, since students are eager to eat muffins while muffins are neither eager to eat students nor eager to be eaten by them (despite what the muffin in The Muffin Song says). Yet the formula is true and easy to prove.

To see via an example why the duality relation holds, notice that if we take the table given above depicting our sharing-scheme for the 7-muffin, 4-student problem, in which rows sum to 1 and columns sum to 7/4, and we flip it diagonally so that rows become columns and columns become rows, we get a table in which rows sum to 7/4 and columns sum to 1; if we then multiply every number in sight by 4/7, we get a table in which rows sum to 1 and columns sum to 4/7. That is, we get a sharing-scheme for the 4-muffin, 7-student problem! What’s more, the size of the smallest piece in the new sharing-scheme is exactly 4/7 times the size of the smallest piece in the old sharing-scheme. And this relationship is a two-way street: each scheme for the 7-muffin, 4-student problem gives a scheme for the 4-muffin, 7-student problem *and vice versa*.
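The transpose-and-rescale argument is concrete enough to carry out mechanically. Here’s a sketch (mine, not from any of the papers cited here) that flips the 7-muffin, 4-student table and verifies that the result solves the 4-muffin, 7-student problem:

```python
from fractions import Fraction as F

# The 7-muffin, 4-student scheme as a 7x4 table: entry (i, j) is the
# amount of muffin i given to student j. Rows sum to 1, columns to 7/4.
zero = F(0)
table = [
    [F(7, 12), zero, F(5, 12), zero],
    [F(7, 12), zero, F(5, 12), zero],
    [F(7, 12), zero, F(5, 12), zero],
    [zero, F(7, 12), zero, F(5, 12)],
    [zero, F(7, 12), zero, F(5, 12)],
    [zero, F(7, 12), zero, F(5, 12)],
    [zero, zero, F(6, 12), F(6, 12)],
]

# Flip rows and columns, then multiply every entry by 4/7.
dual = [[F(4, 7) * table[i][j] for i in range(7)] for j in range(4)]

# Now rows (muffins) sum to 1 and columns (students) sum to 4/7, so
# this is a sharing-scheme for the 4-muffin, 7-student problem.
assert all(sum(row) == 1 for row in dual)
assert all(sum(dual[j][i] for j in range(4)) == F(4, 7) for i in range(7))

# The smallest nonzero piece is (4/7) * (5/12) = 5/21, as duality predicts.
smallest = min(x for row in dual for x in row if x != 0)
print(smallest)  # 5/21
```

In particular this exhibits a scheme showing that *f*(4,7) is at least (4/7)·(5/12) = 5/21.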

**HISTORY OF THE PUZZLE**

Alan told me about the problem in 2008. I proceeded to share it with the math-fun forum that both Erich Friedman (mentioned above) and I belong to, and the problem quickly went viral in the math-puzzle community. In many forums the puzzle appeared without attribution. This is a shame; we should acknowledge the people who invent new puzzles if we want to motivate people to continue to invent them!

One person who saw the puzzle was Jeremy Copeland, who mentioned the problem to Gary Antonick, who published the puzzle (with cupcakes replacing muffins) in his New York Times Numberplay column. Richard Chatwin saw the column and contacted Antonick who put him in touch with me, and I shared with Richard what I knew. Richard was intrigued but eventually put the problem aside without solving it.

In 2016, computer scientist and mathematician William Gasarch attended the biennial Gathering 4 Gardner conference in Atlanta and saw a booklet of fun math puzzles compiled by Julia Robinson Math Festival founder and director Nancy Blachman. One of the puzzles was Alan Frank’s Muffin Problem. Gasarch quickly got hooked and enlisted some undergraduates to help him study the muffin function *f*(*m*,*s*). He reported his progress at the next Gathering 4 Gardner in 2018.

In that same year Richard Chatwin returned to the fray and independently found an efficient method for computing *f*(*m*,*s*), unaware that math-fun subscriber Scott Huddleston had found the same method back in 2010. In 2019, Chatwin was able to construct a proof that the method always gives the optimal solution (see Chatwin’s article listed in the References). Gasarch was in the process of publishing a book about the muffin problem coauthored with several of his students (see the References); he delayed publication so that the book could discuss the long-sought solution.

Alan Frank’s Muffin Problem fits into existing literature on dissection puzzles. Some problems of this kind were popular in ancient Greece, such as the Ostomachion. A more recent dissection puzzle is the “Haberdasher’s Puzzle” of Henry Dudeney. Dudeney asked solvers to divide an equilateral triangle into as few pieces as possible in such a way that the pieces can be arranged to form a square. The problem appeared in a popular magazine of the day, and while many readers found five-piece dissections, only one reader found the four-piece dissection shown in Endnote #2.

The Muffin Problem can be seen as a one-dimensional dissection puzzle where we are trying to dissect a collection of *m* line segments of length 1 into a collection of *s* line segments of length *m*/*s*. Or, if we prefer a more symmetrical version of the problem (as the duality law invites us to adopt), we can rescale the problem and cast it as a problem about dissecting a collection of *m* line segments of length 1/*m* into a collection of *s* line segments of length 1/*s*. Or, if you prefer, a problem of dissecting a collection of *m* line segments of length *s* into a collection of *s* line segments of length *m*. In any case, what’s novel about Alan’s one-dimensional dissection problem is that, instead of minimizing the number of pieces, we’re supposed to maximize the size of the smallest piece.

**INTO THE FUTURE**

The muffin problem may be solved in the sense that we now have a validated algorithm for computing *f*(*m*,*s*) for any *m* and *s* we like, but the story isn’t over. Once we’ve taken the symmetrical point of view, there’s no reason to limit ourselves to just two variables *m* and *s*. Here’s an example of the sort of variant Huddleston is currently considering: if we have to divide the number 1 into fractions that can be organized into 3 groups that each add up to 1/3, and can be organized into 5 groups that each add up to 1/5, and can be organized into 7 groups that each add up to 1/7, how can we make the smallest of the fractions as big as possible?

An intriguing aspect of the original problem is that *f*(*m*,*s*) turns out only to depend on the ratio *m*/*s*. This is far from obvious to me. For instance: I can see that any good solution to the *m*=7, *s*=4 puzzle gives a solution to the *m*=14, *s*=8 puzzle that’s at least as good, since I can divide the 14 muffins into two groups of 7 and apply the *m*=7, *s*=4 solution to each set. So *f*(14,8) can’t be smaller than *f*(7,4). But I might hope that the (14,8) problem gives me room to find a slightly better solution. That is, I might hope that *f*(14,8) is slightly *bigger* than *f*(7,4). Chatwin’s proof shows that it isn’t, but I’d love to know a simpler proof.
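The easy half of that argument, duplicating a scheme, can itself be checked in code. Here’s a sketch (my own) that doubles the optimal 7-muffin, 4-student scheme into a 14-muffin, 8-student scheme and confirms that the smallest piece is still 5/12:

```python
from fractions import Fraction as F

# The optimal 7-muffin, 4-student scheme: each inner list is one muffin,
# recorded as (piece size, student) pairs.
base = [
    [(F(7, 12), 0), (F(5, 12), 2)],
    [(F(7, 12), 0), (F(5, 12), 2)],
    [(F(7, 12), 0), (F(5, 12), 2)],
    [(F(7, 12), 1), (F(5, 12), 3)],
    [(F(7, 12), 1), (F(5, 12), 3)],
    [(F(7, 12), 1), (F(5, 12), 3)],
    [(F(6, 12), 2), (F(6, 12), 3)],
]

# Duplicate: muffins 8-14 are shared among students 4-7 by the same pattern.
doubled = base + [[(size, student + 4) for size, student in m] for m in base]

assert len(doubled) == 14
assert all(sum(size for size, _ in m) == 1 for m in doubled)

shares = [F(0)] * 8
for m in doubled:
    for size, student in m:
        shares[student] += size
assert shares == [F(14, 8)] * 8   # 14/8 = 7/4 of a muffin each

smallest = min(size for m in doubled for size, _ in m)
print(smallest)  # 5/12, witnessing f(14,8) >= f(7,4)
```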

Since *f*(*m*,*s*) depends only on the ratio *m*/*s*, we can define *g*(*r*) for any positive rational number *r* as *f*(*m*,*s*) where *m*/*s* = *r*. For instance, we define *g*(3/10) as the common value of *f*(3,10), *f*(6,20), *f*(9,30), … (For half-baked speculation about how we might define *g*(*r*) when *r* is irrational, see Endnote #3.) This function *g*(*r*) encodes all the information you need to solve muffin problems: to find *f*(*m*,*s*), just compute *g*(*m*/*s*). But the function *g* is a rather strange beast. It’s well-behaved on some intervals but extremely erratic on others. Here’s a rough plot of what *g*(*m*/*s*) looks like for values of *m*/*s* between 1 and 10:

And here’s a zoom showing *g*(*m*/*s*) for values of *m*/*s* between 1.00 and 1.09:

These are just two of the fourteen snapshots of the graph of *g*(*m*/*s*) that I created using a table of data provided by Gasarch’s students. With these pictures you can appreciate how complicated *f*(*m*,*s*) is, and you can see why it took so much time, and so many people, to solve the muffin problem!

*Thanks to Richard Chatwin, Alan Frank, William Gasarch, Sandi Gubin and Evan Romer*.

Next month: How Can Math Be Wrong?

**ENDNOTES**

#1. The best solution has smallest piece of size 8/25. I posted this puzzle as a challenge in the Big Internet Math-Off in the summer of 2019; what follows is a slightly adapted version of the solution I received from reader Evan Romer.

Divide the first four muffins as follows:

12/25 + 13/25

11/25 + 14/25

10/25 + 15/25

9/25 + 8/25 + 8/25

Reassemble these pieces into:

13/25 + 11/25

14/25 + 10/25

15/25 + 9/25

leaving 12/25, 8/25, 8/25 left over.

Do this again, five more times (six times in total).

So we’ve used all 24 muffins, and have made eighteen reassembled muffins, 24/25 each.

And we have six 12/25 pieces left over, which make three reassembled muffins.

And we have twelve 8/25 pieces left over, which make four reassembled muffins.

This gives us 18+3+4=25 reassembled muffins, each 24/25 of an original muffin.

The smallest piece was 8/25.
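Romer’s bookkeeping is easy to double-check mechanically. Here’s a sketch (mine), with all piece sizes measured in 25ths of a muffin:

```python
from fractions import Fraction as F

# One round cuts four muffins into these pieces (sizes in 25ths):
round_cuts = [[12, 13], [11, 14], [10, 15], [9, 8, 8]]

# Six rounds use all 24 muffins, each cut up completely.
muffins = [cuts for _ in range(6) for cuts in round_cuts]
assert len(muffins) == 24
assert all(sum(m) == 25 for m in muffins)

pieces = sorted(p for m in muffins for p in m)

# Reassembly: per round, the pairs 13+11, 14+10, 15+9 (leaving 12, 8, 8);
# then the six leftover 12s make three servings and the twelve 8s make four.
servings = ([[13, 11]] * 6 + [[14, 10]] * 6 + [[15, 9]] * 6
            + [[12, 12]] * 3 + [[8, 8, 8]] * 4)
assert len(servings) == 25                  # one serving per student
assert all(sum(s) == 24 for s in servings)  # each serving is 24/25
# The servings use exactly the pieces that were cut, no more, no less:
assert sorted(p for s in servings for p in s) == pieces

print(F(min(pieces), 25))  # smallest piece: 8/25
```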

Here’s a proof by contradiction that 8/25 is the best you can do:

Assume that the smallest piece is (8+*a*)/25, for some *a*>0.

Note 0: Every reassembled muffin will be 24/25 of a muffin. And of course every original muffin is 25/25 of a muffin.

Note 1: The assumption implies that every reassembled muffin consists of two pieces only: if a reassembled muffin had three pieces, it would be at least 3×(8+*a*)/25, which is more than 24/25.

Note 2: If an original muffin has a piece taken out of it that is greater than (9−2*a*)/25, then the remainder must be a single piece: for if the remainder were in two or more pieces, the pieces would add up to more than (9−2*a*)/25 + 2×(8+*a*)/25 = 25/25.

Note 3: *a* cannot be greater than 4: for if we had *a*>4, then the smallest piece would be > 12/25, so every reassembled muffin would be > 2×12/25 = 24/25, contradicting Note 0.

Some reassembled muffin has a piece that’s (8+*a*)/25, so its other piece must be (16−*a*)/25.

But if an original muffin had (16−*a*)/25 cut from it, this satisfies the hypothesis of Note 2 because (16−*a*)/25 > (9−2*a*)/25, so the remainder must be a single piece, so its other piece would be (9+*a*)/25.

So some reassembled muffin is (9+*a*)/25 plus (15−*a*)/25.

But if an original muffin had (15−*a*)/25 cut from it, this satisfies the hypothesis of Note 2 because (15−*a*)/25 > (9−2*a*)/25. So some original muffin is (15−*a*)/25 plus (10+*a*)/25.

… and so on …

So some reassembled muffin is (15+*a*)/25 plus (9−*a*)/25.

But if an original muffin had (9−*a*)/25 cut from it, this satisfies the hypothesis of Note 2 because (9−*a*)/25 > (9−2*a*)/25.

So some original muffin is (9−*a*)/25 plus (16+*a*)/25.

So some reassembled muffin is (16+*a*)/25 plus (8−*a*)/25.

But now we have a piece smaller than (8+*a*)/25, contradicting our assumption.

#2. Here’s the four-piece dissection published by Dudeney:

#3. It’s natural to wonder whether the magic function *g*(*r*) (the one that tells you the value of *f*(*m*,*s*) by way of the formula *f*(*m*,*s*) = *g*(*m*/*s*)) can be extended to values of *r* that are irrational. Here’s a proposed variant of the muffin problem that might relate to this question. Imagine an infinite supply of 1-ounce chocolates rolling off an assembly line that’s staffed by two immortal and indefatigable employees (call them Ethel and Lucy) who have to feed an infinite line of students. When a new chocolate arrives, Ethel cuts it up into smaller morsels and puts the morsels into an infinitely large holding area; meanwhile, Lucy takes morsels from the holding area and hands them out to students. Each morsel must eventually be given to some student (even if it spends millions of years in the holding area), and each student must get exactly *x* ounces of chocolate (even if she has to wait millions of years to get it). If Lucy and Ethel want the smallest morsel any student gets to be as large as possible, what goal should they shoot for, as a function of *x*? I’m pretty sure that when *x* is rational, this is just our friend *g*(*x*) [nope, I was wrong; see the **UPDATE** below]. But what happens when *x* is irrational? If, say, *x* is the golden ratio ϕ = (1 + sqrt(5))/2 = 1.618…, is there a way for Ethel to divide the 1-ounce chocolates into smaller morsels, and for Lucy to distribute the morsels, so that every morsel gets eaten, and every student gets exactly ϕ ounces of chocolate, and no morsel is less than 0.4 ounces?

**UPDATE**: Evan Romer points out that for any *x* between 1 and 2 we can divide each chocolate of size 1 into a piece of size *x−*1 and a piece of size 2−*x* and then reassemble them to form infinitely many pieces of size (*x−*1)+(*x−*1)+(2−*x*) = *x*. In assembly-line terms, the holding area will get more and more crowded, but every piece added to it will eventually get handed to a student. In the case *x* = 1.618…, the smallest piece has size 2−*x* = 0.382…, which is unfortunately just a hair smaller than 0.4. Can anyone beat 0.382…? Can anyone achieve 0.4? [Yes! See Yoav Kallus’ solution in the Comments.]

**REFERENCES**

Richard Chatwin, An optimal solution for the muffin problem, https://arxiv.org/pdf/1907.08726.pdf.

William Gasarch, Erik Metz, Jacob Prinz, and Daniel Smolyak, Mathematical Muffin Morsels: Nobody wants a small piece. World Scientific, 2020. See also the companion website www.cs.umd.edu/users/gasarch/MUFFINS/muffins.html .

Bear with me if I seem to be veering out of my lane (as they say nowadays), but let me ask: What is chess? If you play with a chess set in which a lost pawn has been replaced with a button, you’re violating tournament regulations but most people would say you’re still playing chess; the button, viewed from “inside” the game, is a pawn. Likewise, if you’re playing against your computer, the picture of a chessboard that you see on your screen is fake but the game itself is real. That’s because chess isn’t about what the pieces are made of, it’s about the rules that we follow while moving those pieces. Asking “Do pawns exist?”, meaning “Are there real-world objects that behave in accordance with the rules of chess?”, misses the point. If one of your pieces has been shoddily manufactured and spontaneously fractures, that doesn’t mean that your mental model of how chess pieces behave is flawed; it’s reality’s fault for failing to conform to your mental model.

You’ve probably already guessed the agenda behind my rambling about chess, but here it is explicitly: I claim that math (pure math, anyway) is as much a game as a science. The objects of mathematical thought, like the pieces in chess, are defined not by what they “are” but by the rules of play that govern them. The fact that in math the pieces exist only in our imaginations and the moves are mental events doesn’t make the rules any less binding. And even though the rules are human creations, once we’ve agreed to them, the answers to questions like “Is chess a win for the first player?” or “Is the Riemann Hypothesis true?” aren’t matters of individual opinion or group consensus; the answers to our questions are out of our hands, irrespective of whether we like those answers or even know what they are.^{1}

**PEOPLE AT A PARTY**

I’ll illustrate my point with a gem of discrete mathematics that can be traced back to a polymathic prodigy named Frank Ramsey who made deeply original contributions to math, philosophy, and economics before dying in 1930 at the age of 26. (For more about Ramsey see the Anthony Gottlieb article and the Cheryl Misak book listed in the References.) I learned most of what I know about the body of mathematical work known today as Ramsey Theory from a great book I read back in the 1980s by Ronald Graham (who died earlier this month; more about him later) and his coauthors Bruce Rothschild and Joel Spencer. Two good article-length introductions to the topic (both originally published in Scientific American) are Martin Gardner’s 1977 article and Graham and Spencer’s 1990 article, listed in the References.

The mathematical gem is a puzzle often phrased in terms of a party attended by six people. Must it be true that you can find three of the six who are mutual acquaintances or three of the six who are mutual nonacquaintances?^{2} That formulation of the puzzle can lead to confusion, hinging on questions like “If A is acquainted with B, must B be acquainted with A?” (answer: for purposes of the puzzle, yes); “Isn’t it possible for there to be degrees of acquaintanceship, with no clear dividing line between acquaintances and nonacquaintances?” (answer: for purposes of this puzzle, it doesn’t matter where you draw the line, as long as you draw it somewhere); “Isn’t being acquainted time-dependent?” (answer: sure, but the claim is that at any given moment you can find three mutual acquaintances or three mutual nonacquaintances).

If you find such real-world issues distracting, you might prefer a pictorial model. If we have six points (call them *vertices*^{3}) arranged in a regular hexagon, and we connect each pair of vertices by either a red line segment or a blue line segment (call these segments *edges*), must it be true that you can find three vertices that are joined up by red edges or three vertices that are joined up by blue edges?

Since it gets tiresome to keep saying “three vertices that are joined up by red edges” let’s call such a trio of vertices a “red triangle”, and likewise let’s call a trio of vertices that are joined by blue edges a “blue triangle”. We have to keep in mind that points where edges cross are not vertices, so the picture below does not actually contain what we mean by a red triangle. Likewise for blue triangles. In making these stipulations we’re not trying to legislate reality; we’re just saying how terms are defined in Ramsey theory.

Here’s a picture that I generated by putting in colors at random. There are 4 red triangles and 2 blue triangles. How quickly can you find them all (if that’s your kind of thing)?

Part of what makes the six-people puzzle tricky is that it’s just on the line between doable and undoable; if instead of six vertices you have a smaller number (or equivalently if there are fewer than six people at the party), finding three mutual acquaintances, or failing that three mutual nonacquaintances, might under some circumstances be an impossible task. For instance, if there’s a party consisting of two men who are acquaintances and two women who are acquaintances but neither of the men is acquainted with either of the women, then there won’t be three mutual acquaintances nor will there be three mutual nonacquaintances. You might enjoy trying to come up with a situation where there are five people at a party but there are no three mutual acquaintances and no three mutual nonacquaintances. For an answer expressed pictorially, see Endnote #4.
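One standard five-person arrangement (quite possibly the one pictured in the endnote, though I’m describing it independently here) seats the five in a circle, with each person acquainted with exactly their two neighbors. A short check, in Python, that this works:

```python
from itertools import combinations

n = 5
# Acquaintance pairs: each person knows only their two neighbors in a cycle.
acquainted = {frozenset({i, (i + 1) % n}) for i in range(n)}
unacquainted = {frozenset(p) for p in combinations(range(n), 2)} - acquainted

def mutual_triples(relation):
    """All triples of people in which every pair has the given relation."""
    return [t for t in combinations(range(n), 3)
            if all(frozenset(p) in relation for p in combinations(t, 2))]

# No three mutual acquaintances, and no three mutual nonacquaintances:
print(mutual_triples(acquainted), mutual_triples(unacquainted))  # [] []
```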

You might enjoy pondering the six-person party puzzle on your own for a bit, but if you’d like a hint to get you started, try this: Suppose that you are one of the people at the party. You look around at the other five people. If the majority of them are people you’re acquainted with, there must be at least three such people; mentally single them out in some way (imagine them wearing silly hats if you like). Alternatively, if the majority of the five are people you’re unacquainted with, there must be at least three such people; mentally single out three people you’re *not* acquainted with. Either way, consider the situation that prevails among the three people you’ve singled out and how they relate to one another and you in terms of who is acquainted with whom. If you’re still stumped, check Endnote #5.

Although I used parties and colored drawings to anchor the puzzle in physical reality in a hopefully helpful way, the question isn’t a question about the real world. If you attend a party and you find two people who you deem to be neither acquainted nor unacquainted but rather “half-acquainted”, that doesn’t break the logic of the solution to the puzzle; it just means that the assumptions of the puzzle don’t apply to your party. Likewise, if you want to challenge the red/blue dichotomy underlying the puzzle by arguing that there are intermediate colors that some people call red and others call blue (an issue that actually arises in the real world for blue and green, which many people have trouble distinguishing) — then you’re doing psychology or optics but you’re not doing Ramsey theory, in much the same way that Alexander the Great in cutting the Gordian knot wasn’t doing knot theory.

**SIM**

One thing that complicates my tidy analogy between doing math and playing games is that sometimes people invent math to analyze games or invent games to exemplify math. An instance of this that I’ve already written about is the interplay between combinatorial games and Conway’s system of surreal numbers. I’ve also invented a game of my own called Swine in a Line for the purpose of illustrating a mathematical process called chip-firing (which itself is often called a game even though it isn’t one).

A game related to Ramsey Theory is the pencil game Sim, invented by mathematician Gustavus Simmons, played on a sheet of paper initially marked with six vertices arranged in a regular hexagon and no edges connecting them. You’ll need two colored pencils for this game, say red and blue. You and your adversary take turns connecting the vertices with edges (one of you drawing red edges, the other drawing blue) until either a red triangle or a blue triangle is created, at which instant the person who created the triangle loses. The Ramsey theorem I mentioned above tells us that by the time all fifteen possible edges have been drawn, if not sooner, there must be at least one red triangle or blue triangle (or both), so the game can’t end in a draw. It’s now known, thanks to a brute-force computer search that inventoried all positions and all moves, that the second player has a winning strategy, but nobody has been able to come up with a simple way to implement that strategy, short of consulting the computer’s annotated list of all positions and all moves.
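The no-draw fact is small enough to confirm by brute force: there are only 2^{15} = 32768 ways to color the fifteen edges of a completed Sim board. A sketch of the check (my own):

```python
from itertools import combinations, product

edges = list(combinations(range(6), 2))      # the 15 edges of K6
triangles = list(combinations(range(6), 3))  # the 20 candidate triangles

def has_mono_triangle(coloring):
    color = dict(zip(edges, coloring))
    return any(len({color[(a, b)], color[(a, c)], color[(b, c)]}) == 1
               for a, b, c in triangles)

# Every one of the 32768 red/blue colorings contains a monochromatic
# triangle, so a fully drawn Sim board always has a loser.
assert all(has_mono_triangle(c) for c in product("RB", repeat=15))
print("every completed board contains a monochromatic triangle")
```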

Ben Orlin (of Math With Bad Drawings fame) is writing a not-yet-named book about mathematical games, and Sim is one of the games he writes about. He actually prefers a slightly different version he calls “Jim Sim”, after his father Jim Orlin who came up with it; in Jim Sim, the six vertices are drawn at the corners of two nested triangles.

Ben prefers this arrangement because it cuts down on the number of places where edges cross other edges and create visual clutter. Personally, I prefer the extra symmetry of the standard version. But mathematically, the difference between ordinary Sim and Jim Sim is purely cosmetic. (Ben proposes a special rule to handle situations where a player has won without knowing it, allowing the losing player to win; it’s cute but mean.)

**PLAYING BY THE RULES**

In any game, it’s important to specify the rules as precisely as possible, but it’s hard to anticipate every conceivable misunderstanding. Some misunderstandings are innocent; others are mischievous. I still don’t know what to think about a misunderstanding in my own life that took place almost fifty years ago. According to a well-known result that long predates (and in a way prefigures) Ramsey theory and the game of Sim, tic-tac-toe always ends in a draw if both players play intelligently. At the age of twelve, I became aware of this fact and boasted to a six-year-old that I couldn’t lose at tic-tac-toe. She promptly “beat” me, playing X to my O, like this:

I wasn’t able to convince her that the game isn’t played this way; she thought I was just being a sore loser. Or was she trolling me?

A lot of math-trolling on the web about standard mathematical facts is based on a misunderstanding of the rules of the game. Some misunderstandings are innocent, others seem willful, and others are hard to classify. For instance, I once saw someone critique a valid algebraic proof saying “But look, you added 0 to the left-hand-side of the equation but not the right hand side! In algebra, you always have to do the same thing to *both* sides!”. I suspect that that person’s real quarrel is not with the mathematical establishment but with a teacher who didn’t describe the rules of algebra with sufficient clarity a decade or two ago.

**THE IMPOSSIBLE PROBLEM**

Whether you know it or not, we’ve been talking about graph theory. That’s the mathematical discipline that allows one to get rid of all the annoying real-world ambiguities inherent in the people-at-a-party scenario. If a mathematician at a party with six or more people, in an effort to be accessible, tells a non-mathematician the people-at-a-party puzzle, and the non-mathematician willfully and repeatedly tries to dissolve the puzzle by making real-world-based objections, what’s a peeved mathematician to do but to create a domain of the mind in which the mathematical solution is impervious to quibbles because the assumptions of the puzzle hold by definition?

In analytic geometry a graph is a picture representing a function like *y* = *mx* + *b* or *y* = *x*^{2}, but that’s a different context.^{6} In graph theory, a graph is a collection of points and a collection of segments joining those points, called vertices and edges respectively. One special sort of graph with *n* vertices contains all the possible edges joining them; this is called a *complete* graph on *n* vertices, sometimes abbreviated *K*_{n}. The way a graph theorist would ask the people-at-a-party puzzle is: given any red-blue coloring of the edges of a *K*_{6}, must it contain a red *K*_{3} or a blue *K*_{3}? The answer is yes, and 6 is the smallest number of vertices with this property; we express this by saying that the Ramsey number *R*(3,3) equals 6, where *R*(*a*,*b*) denotes the smallest *n* such that every red-blue coloring of a *K*_{n} contains a red *K*_{a} or a blue *K*_{b}.

What about other values of *a* and *b*? It’s long been known that *R*(4,4) is 18. That is, if you have a *K*_{18} and you color each of its edges red or blue, it must contain a red *K*_{4} or a blue *K*_{4}. Or (going back to the other model) if there is a party attended by 18 people, you can always find four people who are mutual acquaintances or four who are mutual nonacquaintances (or both). And 18 is the smallest number with this property; if you replace 18 by 17 the claim becomes false, as can be shown by analysis of the highly symmetrical figure below (where two circles are connected by a path if and only if the two associated people are acquaintances).
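The symmetrical figure doesn’t reproduce here, but one standard witness for the claim that 17 people aren’t enough (my choice of example; the figure may well depict the same graph) is the “Paley” coloring: make *i* and *j* acquainted exactly when *i* − *j* is a nonzero square modulo 17. A brute-force check in Python:

```python
from itertools import combinations

# Quadratic residues mod 17; since -1 is among them, i-j and j-i are
# squares or non-squares together, so the coloring is well-defined
# on unordered pairs.
squares = {(x * x) % 17 for x in range(1, 17)}

def acquainted(i, j):
    return (i - j) % 17 in squares

def has_monochromatic_k4():
    # Scan all 2380 groups of four people for six same-colored pairs.
    for quad in combinations(range(17), 4):
        colors = {acquainted(i, j) for i, j in combinations(quad, 2)}
        if len(colors) == 1:
            return True
    return False

print(has_monochromatic_k4())  # False: no 4 mutual (non)acquaintances
```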

In an astonishingly original piece of work, Ramsey proved that this cutoff phenomenon is universal. Pick any positive integer *k* you like; once the number of people at the party becomes large enough, you can be sure that there will be *k* guests who are mutual acquaintances or *k* guests who are mutual nonacquaintances (or both). The hard part is figuring out what “large enough” means. When *k* is 3, large enough means 6 people; when *k* is 4, large enough means 18 people; and when *k* is 5, nobody knows exactly where the cutoff is, even though Ramsey proved that the cutoff exists! Back when Graham et al. wrote their book on Ramsey theory, all that was known about the Ramsey number *R*(5,5) is that it’s somewhere between 43 and 49 inclusive. In 2017, Vigleik Angeltveit and Brendan McKay were able to shave that 49 down to 48. Similarly, all we currently know about the Ramsey number *R*(6,6) is that it’s somewhere between 102 and 165 inclusive.

Part of what’s poignant about such Ramsey problems is that each one, taken individually, is a question that could in principle be solved by brute force on a computer capable of systematically examining the brutally large but still finite set of possibilities.^{7} Evelyn Lamb takes a look at the issue through the lens of Moore’s Law, and the results are discouraging. Even with Moore’s Law on our side, computers will never be powerful enough to pin down the Ramsey number *R*(6,6) unless we come up with a better way to think about it.

The mathematician Paul Erdős, an early pioneer of Ramsey theory, didn’t think the human race was smart enough to find a better way. He once asked an audience to imagine a scenario in which super-powerful aliens threaten humankind, saying they’ll destroy our species unless we compute some specified Ramsey number. In the case *R*(5,5), Erdős imagined that, if we threw all our computer power and brain power into the effort, with the threat of annihilation providing extra motivation, humanity could determine whether the Ramsey number is 43, 44, 45, 46, 47, or 48. In contrast, in the case of *R*(6,6), Erdős opined that our best chance of survival lay not in meeting the aliens’ challenge but in finding ways to defeat them militarily, because as tough as those aliens are, they can’t be as tough as Ramsey problems. In his book “Inevitable Randomness in Discrete Mathematics”, József Beck writes: “Ramsey Games represent a humiliating challenge for Combinatorics.”

Though Erdős’ parable is farfetched, I can think of something even more implausible: aliens who show up threatening to destroy us unless we prove or disprove the proposition that chess is a win for the first player. Let me be clearer about the scenario I’m discussing and disbelieving. I can swallow aliens who’ve studied our culture, including our quaint game of “chess”, and who taunt us by challenging us to prove something about it. What I *can’t* swallow is aliens who have come up with the game of chess *independently of us*. That’s because there are just too many possible chess-like games and not enough life-bearing planets in the galaxy.^{8} In contrast, I believe Ramsey theory gets invented on most planets that develop space-faring civilizations. Indeed, one of the animating impulses behind Ramsey theory can be experienced when one looks up at the night sky, observes geometric patterns in the stars, and wonders “How many of these seemingly meaningful patterns actually become inevitable once you have enough stars in your sky?”

**RON GRAHAM**

I had the pleasure of knowing Ron Graham. He may be best known to the public at large for having invented “Graham’s number”^{9} in the course of his work on Ramsey theory, but I was very influenced by other work of his (much of it joint with his wife, the mathematician Fan Chung) on quasirandom graphs. He was generous to me in various ways, of which I’ll mention three:

1) In the late 1980s Ron gave me an article to referee for a journal, saying “I think you’ll like it.” Spending a few weeks immersed in that article gave me ideas that were at the center of my research for more than a decade.

2) Around the same time Ron gave me a puzzle from his office called a Panex. It’s similar to the Tower of Hanoi puzzle, but harder. In the case of the Tower of Hanoi puzzle, which involves moving disks between three spindles, it’s known exactly how many moves are required to solve the problem when there are *n* disks. In contrast, the optimal solution to the general version of the Panex puzzle is still unknown. The Panex that Ron gave me is still in my office. If you want to give the puzzle a try, here’s a virtual version.

3) Back in the early 1990s Ron invited me to attend a conference I’d never heard of. It was a “Gathering 4 Gardner”, the second such gathering in an ongoing series, and these biennial meetings have been a big part of my mathematical life ever since.

Ron also gave me a suggestion thirty years ago that I hope to bring to fruition during the coming year. Visiting MIT in the mid-90s, Ron passed an elevator bay along one of the corridors near the math department and quipped that it ought to sport a sign that said “*L*(*η*)” (if you don’t get the joke, try saying it the way a mathematician would if *L* were a function and *η* were the argument to which the function was being applied). At the time I was just an assistant professor at MIT with no voice in departmental decor. But now that I am a full professor at UMass Lowell, I plan to put up an *L*(*η*) sign next to the elevator bank at our department’s new headquarters.

Ron was responsible for creating a lot of good math, and he also took responsibility for communicating it to the broadest possible audience (where the meaning of “broadest possible” of course depended on how technical the result was). Here’s one morsel I just learned about recently: Consider the operation of taking a string of decimal digits, inserting plus signs between some of them (for instance, 123 could become 12+3 or 1+23 or 1+2+3), and calculating the sum. If you keep performing this “insert-and-add” operation over and over, you eventually get stuck and the game ends; for instance 12+3 is 15 and 1+5 is 6 and the game is over, or 1+23 is 24 and 2+4 is 6 and the game is over, or 1+2+3 is just 6 right away. It’s easy to show that you always get down to a single-digit number, and it’s not much harder to show that you always get down to the same single-digit number no matter how you stick in the plus signs. This phenomenon (based on the same logic that Martin Gardner describes in his article on digital roots) could be used as the basis of a magic trick that Ron would have liked (see his book with Diaconis in the References). Ask someone from the audience to arrange the digits 1 through 9 in any order, perform the insert-and-add (with them, not you, picking where to insert the plus signs) over and over until a single-digit answer appears, and then have them open an envelope you gave them at the start of the trick, containing a piece of paper on which you’ve written “The final answer is 9.” That’s a nice trick, but it’s not the morsel I’m talking about. Here’s the morsel: No matter what string of digits I give Ron, there’s a way for him to insert plus signs in such a way that after just three applications of insert-and-add, he’ll arrive at a single-digit answer.
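The insert-and-add game is easy to simulate. Here’s a Python sketch (function names are mine) that checks the invariance the trick relies on: since every positional weight 10^m is 1 mod 9, inserting a plus sign never changes a number’s value mod 9, so every play ends at the same digital root.

```python
import random

def insert_and_add(digits, cuts):
    """Split the digit string at the given cut positions and sum the pieces.
    insert_and_add("123", [1]) computes 1 + 23 and returns "24"."""
    pieces, prev = [], 0
    for c in sorted(cuts) + [len(digits)]:
        pieces.append(int(digits[prev:c]))
        prev = c
    return str(sum(pieces))

def all_cuts(digits):
    # a plus sign between every pair of adjacent digits
    return list(range(1, len(digits)))

def play(digits, choose_cuts):
    """Repeat insert-and-add until a single digit remains."""
    while len(digits) > 1:
        digits = insert_and_add(digits, choose_cuts(digits))
    return digits

# No matter how the cuts are chosen, the final digit is the digital root.
assert play("123", all_cuts) == "6"

# The envelope trick: any arrangement of 1 through 9 has digit sum 45,
# whose digital root is 9, so the final answer is always 9.
digits = list("123456789")
random.shuffle(digits)
assert play("".join(digits), all_cuts) == "9"
```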

My first thought when I read this claim was that I must have misunderstood it. If I start with a typical large number *n*, the sum of its digits is likely to be close to the logarithm of *n*, give or take a factor of 10.^{10} So Graham and his coauthors seem to be saying that if you start with any number *n*, however large, and take the logarithm three times in succession, you get to an answer that’s ten or smaller. You don’t need Graham’s number to see that that’s ridiculous; even as comparatively puny a quantity as

will do.

But do you see the mistake I made? Nobody said Ron has to insert plus signs everywhere. It’s true that if I want the very next number in the number-chain to be as small as possible, inserting plus signs everywhere is the way to go, but that strategy may show its short-sightedness further down the line; the race doesn’t always go to the strongest starter. For instance, consider the starting number 919; if Ron turns it into 91+9 instead of 9+1+9, he can get to the final answer 1 in two steps (91+9 = 100; 1+0+0 = 1) instead of three steps (9+1+9 = 19; 1+9 = 10; 1+0 = 1).
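Here’s a minimal sketch (Python; the helper is my own) verifying both routes through 919:

```python
def step(digits, cuts):
    """One insert-and-add move: split the string at the cut positions and sum."""
    total, prev = 0, 0
    for c in sorted(cuts) + [len(digits)]:
        total += int(digits[prev:c])
        prev = c
    return str(total)

# Greedy play (plus signs everywhere) takes three steps on 919...
assert step("919", [1, 2]) == "19"   # 9+1+9 = 19
assert step("19", [1]) == "10"       # 1+9 = 10
assert step("10", [1]) == "1"        # 1+0 = 1

# ...but cutting 919 as 91+9 manufactures zeroes and finishes in two.
assert step("919", [2]) == "100"     # 91+9 = 100
assert step("100", [1, 2]) == "1"    # 1+0+0 = 1
```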

“Fair enough,” you might say, “But Ron got lucky that time; if he inserts plus signs into a string of digits at random, it won’t happen very often that the insert-and-add operation will lead to a number with so many zeroes.”

But wait: Ron doesn’t need this proliferation of zeroes to be something that happens at *random*. As long as there’s *some* way of inserting the plus signs to make lots of zeroes appear, that’s good enough for him. And there are exponentially many ways to insert those plus signs, so he has lots of ways to potentially win.

So maybe it’s not so surprising that if Ron plays the insert-and-add game wisely, he can arrive at a single-digit answer in fewer steps than if he naively puts in plus-signs everywhere. But just *three* steps? No matter how big the original number is? That’s kind of amazing. See the Butler, Graham, and Stong article for details. (It’s a bit more complicated than making sure you get lots of zeroes right away.)

To get a sense of Ron Graham as a person, I urge you to check out George Csicsery’s short video portrait of Graham, entitled “Something New Every Day”.

In addition to being a mathematician, Graham was also an avid juggler, and once said “The problem with juggling is, the balls go where you throw them, not where you wish they would go, or where they are supposed to go.” Once a ball leaves your grip, its trajectory is out of your hands. Graham might have added that, analogously, the problem with math is that it follows the rules you set up. Once you’ve launched a definition into the air, it goes where you sent it, and the theorem that lands in your hands, possibly much later, may be very different from what you expected. At times this discrepancy can be a source of frustration, but for the mathematician it can also be a source of delight.

Next month: The Muffin Curse.

*Thanks to Jeremy Cote, Sandi Gubin, David Jacobi, Andy Latto, Joe Malkevitch, Gareth McCaughan, Ben Orlin, Evan Romer, and Rich Schroeppel.*

**ENDNOTES:**

#1: Here I’m stressing the “internal” view of math. There is an equally important external view that situates math in historical and cultural context, dealing with such topics as “What questions get asked in the first place? Why these particular definitions and rules, and not others?” Nearly all mathematics has some real-world phenomenon as its point of departure, and much mathematics that has left the real world far behind can orbit back and have real-world applications. But that’s not what I’m talking about here.

#2: Here and elsewhere, “or” is meant in the inclusive, “and/or” sense; if you can find three mutual acquaintances *and* three mutual nonacquaintances among the six people, all the better!

#3: Martin Gardner used the plural noun “vertexes”, but I think this is less common nowadays. Have any of you seen this in recent publications? How about “matrixes” instead of “matrices”?

#4: Here’s a graph with 5 vertices whose edges have been colored red and blue, containing no red triangle and no blue triangle.

#5: Say the three people you’ve singled out are all people you’re acquainted with. If any two of them are acquaintances, then the two of them, along with you, form an acquainted threesome. On the other hand, if no two of them are acquaintances, then the three of them form an unacquainted threesome. Similar logic applies in the case where the three people you’ve singled out are all people you’re unacquainted with.

#6: The word “graph” is used in rather different senses in analytic geometry and discrete mathematics. Do mathematicians writing in languages other than English have to contend with this ambiguity? Let me know in the Comments.

We allow the edges to cross one another, but we don’t allow an edge to pass through extra vertices en route from one vertex to another. To make the pictures nicer, we sometimes allow the edges to curve, but we don’t allow a curved edge to join a vertex to itself, nor do we allow more than one edge between two particular vertices.

If you push mathematicians hard they will retreat even further from reality and say that a graph isn’t actually a picture at all, but rather an abstract collection of entities called vertices that are unspecified objects of pure thought, along with other entities called edges that are unordered pairs of vertices. So don’t push us.

#7: Remember the “find the 4 red triangles and 2 blue triangles” challenge from earlier? Solving it is just a matter of patience, not cleverness, since there are only twenty trios to try. If you disliked that puzzle, I don’t blame you; I’m more interested in problems that require intelligence. Most famous mathematical problems aren’t reducible to exhaustive examination of finitely many cases; for instance, both Fermat’s Last Theorem (solved) and the Riemann Hypothesis (still open) are “infinitary” in this sense.

#8: If you doubt this, take a look at all the variants listed in the Wikipedia page on chess variants and now imagine all the variants that *could* be listed there but aren’t. If nevertheless you think that chess stands out as one of the best of the bunch, for reasons that would apply throughout the galaxy, please explain!

If you’re wondering why I’m not worried about invasions from other galaxies, it’s because even though interstellar distances are large, intergalactic distances are *ridiculously* large. Any intergalactic being that tried to pop over to slake its thirst for conquest would become millions of years old and eventually forget why it had wanted to come to our galaxy in the first place. Multigenerational voyages wouldn’t solve the problem either: after the initial crew died, later generations would rebel against being cooped up in a spaceship on a journey they’d never get to finish, especially when the whole idea of the trip wasn’t theirs to begin with.

#9: Graham’s number is big. Really big. You just won’t believe how vastly, hugely, mind-bogglingly big it is. I mean, you may think it’d be hard to write out all the digits of nine raised to the ninth power nine times, but that’s just peanuts to Graham’s number. If you’re itching to know more, filmmaker Brady Haran has what you crave. Haran is the creator of the trilogy of mathematico-cinematic blockbusters “Graham’s Number”, “How Big Is Graham’s Number?”, and “Wait, What Is Graham’s Number Again?”, which he followed with a fourth video, entitled “Well, That About Wraps It Up For Graham’s Number”. (The actual names of some videos have been changed to be more like something Douglas Adams would have come up with.)

As far as I know, nobody has proposed a game playable by pan-dimensional beings in which the impossibility of the game ending in a draw is proved by the theorem of Graham and Rothschild that led Graham to invent his number. But if anyone did, the game would clearly deserve to be called “GRim” (in honor of Graham and Rothschild).

#10: Suppose *n* has *k* digits, and suppose that *n* contains each of the digits 0 through 9 in roughly equal proportion, so that the average of the digits is around 4.5; then the sum of the digits should be about 4.5 times *k*, which is about 4.5 times the base ten logarithm of *n*.
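As a quick empirical check of this endnote (a Python sketch; the choice of *k* = 1000 and the seed are mine), a random *k*-digit number has digit sum close to 4.5 times *k*, which in turn is close to 4.5 times its base-ten logarithm:

```python
import math
import random

random.seed(0)  # seeded so the sketch is reproducible
k = 1000
# a random k-digit number: nonzero leading digit, then k-1 random digits
digits = [random.randint(1, 9)] + [random.randint(0, 9) for _ in range(k - 1)]
n = int("".join(map(str, digits)))

assert abs(math.log10(n) - k) < 1                   # log10 of a k-digit number is just under k
assert abs(sum(digits) - 4.5 * k) < 0.1 * 4.5 * k   # digit sum within 10% of 4.5*k
```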

**REFERENCES**

József Beck, Inevitable Randomness in Discrete Mathematics.

Alexander Bogomolny, “Ramsey’s Number R(4,4)” at cut-the-knot.org.

Anthony Bonato, “Breakthrough in Ramsey Theory”, The Intrepid Mathematician, April 5, 2017, https://anthonybonato.com/2017/04/05/breakthrough-in-ramsey-theory/ .

Steve Butler, Ron Graham, and Richard Stong, Inserting Plus Signs and Adding, The American Mathematical Monthly, 123(3), March 2016, 274-279, http://orion.math.iastate.edu/butler/papers/16_03_insert_and_add.pdf .

Fan Chung, About Ron Graham, http://www.math.ucsd.edu/~fan/ron/ .

F. R. K. Chung, R. L. Graham, and R. M. Wilson, “Quasi-random graphs”, Combinatorica, volume 9, pages 345–362 (1989), http://www.math.ucsd.edu/~fan/wp/quasirandom1.pdf .

George Csicsery, “Something New Every Day: The Math & Magic of Ron Graham”, Zala Films; https://vimeo.com/136044050 .

Martin Gardner, “Digital Roots”, in “The Second Scientific American Book of Mathematical Puzzles and Diversions”.

Martin Gardner, “Ramsey Theory”, chapter 17 of “Penrose Tiles to Trapdoor Ciphers… and the Return of Dr. Matrix”.

Martin Gardner, “Sim, Chomp, and Racetrack”, chapter 9 of “Knotted Doughnuts and Other Mathematical Entertainments”.

Anthony Gottlieb, “The Man Who Thought Too Fast”, The New Yorker, April 27, 2020, https://www.newyorker.com/magazine/2020/05/04/the-man-who-thought-too-fast .

Ronald Graham, Papers of Ron Graham, http://www.math.ucsd.edu/~ronspubs/ .

Ronald Graham, Bruce Rothschild, and Joel Spencer, “Ramsey Theory”, Wiley Press; available through https://www.wiley.com/en-us/Ramsey+Theory%2C+2nd+Edition-p-9780471500469 .

Ronald L. Graham and Joel H. Spencer, “Ramsey Theory”, Scientific American, Vol. 263, No. 1 (July 1990), pp. 112-117. http://www.math.ucsd.edu/~fan/ron/papers/90_06_ramsey_theory.pdf

Ronald Graham and Persi Diaconis, Magical Mathematics: The Mathematical Ideas That Animate Great Magic Tricks, Princeton University Press; available through https://press.princeton.edu/books/hardcover/9780691151649/magical-mathematics .

Evelyn Lamb, “Overthinking an Erdos Quote”, https://blogs.scientificamerican.com/roots-of-unity/moores-law-and-ramsey-numbers/

Cheryl Misak, “Frank Ramsey: A Sheer Excess of Powers”, Oxford University Press; available through https://global.oup.com/academic/product/frank-ramsey-9780198755357? .

The more you know, the more you forget.

The more you forget, the less you know.

So why study?

The less you study, the less you know.

The less you know, the less you forget.

The less you forget, the more you know.

So why study?

*— “Sophomoric Philosophy”*

Poor Oedipus! The mythical Theban started out life with every advantage a royal lineage could offer but ended up as the poster child for IFS: Inexorable Fate Syndrome. His parents packed him off in infancy to evade a prophecy that he’d kill his father and marry his mother. He was found on a mountain and raised by a shepherd, so Oedipus didn’t know who his birth parents were. Once he learned about the prophecy he did everything he could to avoid fulfilling it (aside from not killing or marrying anyone, which in those times would have been an undue hardship), but he still ended up doing exactly what he was trying not to do.

If the story of Oedipus seems a bit removed from real life, listen to episode 3 of Tim Harford’s podcast “Cautionary Tales”, titled “LaLa Land: Galileo’s Warning”, to hear about systems that were designed by intelligent, well-meaning people to avert disasters but which ended up causing disasters instead.

In Harford’s diagnosis, the problem is that in adding safeguards to a system we increase its complexity, which makes it harder for our feeble human minds to imagine all the ways in which the system might fail. Yet sometimes it’s not complexity that bites back at us but some simple variable that we’ve failed to take into account. For example, say you live on an island with too many wolves. The obvious solution is to encourage wolf-hunting. Unfortunately, suppressing the wolf population means there will be less predation of the deer population (Did I mention there are deer on this island?), which means the deer population will surge next year, which means there’ll be more young deer for mama wolves to feed to their cubs, and come the year after, there’ll be more wolves than ever.

And that’s if you’re lucky. If you’re unlucky, you succeed in killing all the wolves, which leads to an explosion of the deer population, which leads to irreversible overgrazing (Did I mention the grass?), and then your once-complex ecosystem becomes a wolf-free, deer-free, grass-free desert island.

If this had been a real-world example instead of a parable, there would have been lots more species, and the island probably wouldn’t be an island but a region with porous borders that allow wolves, deer, and other animals to cross into the region and out of it. Maybe the overlooked variable wouldn’t have been the deer population but something subtler — something that in retrospect is hard to miss, but in advance is not so easy to pick out of a crowd of jostling variables. Here’s an example mentioned to me by pre-reader Shecky Riemann (can anyone provide a source for his story?). I quote Shecky’s version:

“In some New England town wildlife officials realized the Wood Duck population was inexplicably falling, while they noticed the raccoon population was growing; they believed raccoons were raiding Wood Duck nests for the eggs, causing the subsequent decline. So they hunted, or trapped and relocated, much of the raccoon population, only to find in time that the Wood Duck population declined *even more* precipitously… At that point they discovered that Wood Duck babies were indeed hatching out but upon leaving the nest and plopping into the water (as they do while very young, well before they can fly) they were immediately being taken by snapping turtles, whose population had ballooned because the only thing previously keeping them in check were the raccoons who ate snapping turtle eggs.”

The predator-prey systems I’ve described embody the idea of a negative feedback loop. This is a causal loop between two or more variables (let’s stick to just two and call them X and Y to keep things simple) where making X bigger makes Y bigger (and making X smaller makes Y smaller) but making Y bigger makes X *smaller* (and likewise making Y smaller makes X bigger).^{1} For instance, say X is the severity of a pandemic and Y is the amount of care people take to prevent disease transmission. When the disease is ripping through a population, people get scared and take care, which after a while causes the number of new infections to drop. But then people get lax, causing the rate of new infections to rise again. To the extent that such a simple picture accurately captures key features of the SARS-CoV-2 pandemic in 2020, the right question (despite the magazine caption) is not “Will infections rise as states reopen their economies?”, but “How much?”

Why do I call this essay “The mathematics of irony”? I won’t be so foolish as to say that all, or even most, irony comes from negative feedback loops. But a lot of irony comes from reversal of expectations, and the presence of negative feedback loops involving variables you’ve ignored can be one reason for reasonable interventions to backfire.

This isn’t news to anyone who studies complex systems. The problem is that scientists and science educators (and I include myself among their number) haven’t done a good enough job of explaining the complexity of reality, or of overcoming people’s desire for simple answers by exploiting their love of a good story. Too many members of the voting public see a sentence like “Population dynamics can be counterintuitive” as a cowardly equivocation or an outright lie, and are all too ready to throw in their lot with someone who runs on a simple, thrilling campaign slogan like “I will kill the wolves.”^{2}

Many physical systems can be understood through the lens of feedback, as seen in something as simple as a pendulum. If you pull the bob to the right, the rightward deflection gives rise to leftward acceleration, which gives rise to leftward velocity, which gives rise to leftward deflection, which gives rise to rightward acceleration, which gives rise to rightward velocity, which gives rise to rightward deflection, and so on.

Here’s a simplified overview of what happens. The four sketches in the middle show, in each of four illustrative cases, where the pendulum bob is (the dot) and where it’s going (the small arrow), and the descriptions on the outskirts give verbal descriptions of each case; the big arrows along the outside show how the system evolves over time.

The mathematics governing this system, expressed in the form of a pair of mutually referential differential equations, has many similarities to the mathematics of an oscillating spring, or an inductor in an electrical circuit. In fact, if you ignore nonlinearities^{3} in the equations, you’ll find that the equations for a pendulum become formally identical to the equations for an electrical circuit; the quantities in one system (deflection, velocity) are different from the quantities in the other (current, voltage), but the way a given quantity in one system evolves over time is identical to the way the corresponding quantity in the other system evolves. The underlying schema is “the” simple harmonic oscillator, one of the stars of an undergraduate physics education.

I hasten to say that no scientist would call a pendulum a negative feedback system as I have, but that’s not for mathematical reasons; it’s because scientists normally use the term “feedback” to describe a relationship between one part of a system and another (wolf-population and deer-population, say), whereas the deflection and velocity of a pendulum bob aren’t different “parts” of a system — they’re different ways of describing *one* part. But the mathematics of feedback loops doesn’t know about parts and wholes; it only knows quantities and how they evolve in time under the sway of differential equations. And from that point of view, a simple harmonic oscillator could be considered the prototype of a system with oscillatory behavior due to negative feedback. Systems with negative feedback loops give rise to oscillating behavior, with oscillations that can decay over time, grow without bound^{4}, settle down into a stable cycle, or approach a single stable state.
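The deflection-drives-velocity, velocity-drives-deflection loop can be simulated in a few lines. Here’s a Python sketch (step size, initial conditions, and the unit frequency are my own choices) in the small-angle regime, using a semi-implicit update so the oscillation neither decays nor blows up:

```python
# Deflection theta feeds velocity v negatively; v feeds theta positively.
dt = 0.001
theta, v = 0.2, 0.0          # start deflected to the right, at rest
trace = []
for _ in range(20000):       # simulate 20 time units
    v -= theta * dt          # rightward deflection -> leftward acceleration
    theta += v * dt          # leftward velocity -> leftward deflection
    trace.append(theta)

# The bob swings back and forth: theta changes sign again and again...
sign_changes = sum(1 for a, b in zip(trace, trace[1:]) if a * b < 0)
assert sign_changes >= 5
# ...and neither decays nor blows up: theta^2 + v^2 stays essentially constant.
assert abs(theta**2 + v**2 - 0.04) < 1e-3
```

Exactly the same two update lines, with the variables renamed, would describe the current and voltage in the electrical circuit mentioned above; that renaming is the “formal identity” between the two systems.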

Fiction is rife with cycles arising from negative feedback loops. A classic kind is the time-travel paradox: if you travel back in time and kill your grandfather before he has kids, then your father never exists, so *you* can’t exist, but then you never travel back in time, so your father does end up existing, and so do you, with the result that you do end up traveling back in time after all, etc. (Maybe calling this a feedback loop is a bit of a stretch, since it involves discrete variables — you’re alive or you’re not, your grandfather is alive or he isn’t — rather than continuous variables of the kind we’ve discussed so far.) You can also see the negative feedback loop governing a fictional couple for whom “A approaches B” leads to “B avoids A” leading to “A avoids B” leading to “B approaches A” leading back to “A approaches B”. If you have favorite examples of negative feedback loops in life or in literature, please let me know in the Comments.

My favorite example of a negative feedback loop in my own life comes from a time — and I hasten to say that this happened *many, many years ago* in case anyone who works for my auto insurance company is reading this — when I was about to leave my home to teach a calculus class and foolishly tried to save a bit of time by adjusting the seat while starting to drive. To slide the driver’s seat forward, one had to reach underneath the seat and pull on a release lever that would allow the seat to slide freely on its tracks. That’s what I did, and if I’d been smarter I would have moved the seat to its new position and let go of the release lever *before starting to drive*. But instead I started the car and put my foot on the gas pedal while my seat was still free to move forward and backward. Can you guess what happened before reading on?

The car started moving forward, but remember, objects at rest tend to remain at rest, so my seat (being free to slide) stayed put with respect to the street, which is to say that my seat moved backward with respect to the car. This caused my foot to leave the gas pedal. That caused the car to slow down, which caused my seat to move forward with respect to the car. That movement pressed my foot against the pedal again, which caused the car to lurch forward again, and so on. The negative feedback loop was unstable, so each successive motion of the car was more violent than the one before, until finally the car (which had manual transmission) stalled out.

In the twentieth century, back before the prefix “cyber-” got repurposed to mean “computer-y”, there was a worldview called cybernetics that saw feedback loops everywhere.^{5} Cybernetics bloomed in conjunction with an engineering discipline called control theory. A key concept of both fields is homeostasis, an equilibrium state achieved through use of a feedback loop. An example of an engineered feedback loop is the thermostat, which allows cooling to happen when something becomes too warm, and warming to happen when it becomes too cool. In the natural world, an example of equilibrium is seen in the predator-prey model I mentioned before. And you can thank feedback mechanisms built into your warm-blooded human body for keeping your core temperature in a zone that permits you to be alive right now. (Not to mention dozens of other metabolic variables that your body is always silently, cybernetically adjusting.)
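The thermostat’s feedback loop can be caricatured in a few lines of Python (all the constants here are invented for illustration): heat when the room is too cold, coast when it’s too warm, with a little hysteresis so the furnace doesn’t switch on and off every instant.

```python
setpoint, band = 20.0, 0.5     # target temperature and tolerance (degrees C)
temp, outside = 15.0, 5.0      # room starts cold; it leaks heat to the outside
heating = True
for _ in range(5000):
    if temp > setpoint + band:
        heating = False        # too warm: furnace off
    elif temp < setpoint - band:
        heating = True         # too cold: furnace on
    # furnace adds heat when on; the room always leaks heat toward outside
    temp += 0.01 * ((2.0 if heating else 0.0) + 0.02 * (outside - temp))

# Homeostasis: the temperature settles into a small band around the setpoint.
assert abs(temp - setpoint) < 2 * band
```

Left to itself the room would drift toward the outside temperature; the feedback keeps it cycling gently around 20 degrees instead, which is homeostasis in miniature.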

The cybernetic worldview rightly stressed the importance of feedback loops (back in college I was especially enthralled with Gregory Bateson’s “Steps to an Ecology of Mind”), but many proponents of cybernetics thought that the study of feedback loops would explain everything. It didn’t, and now the word “cybernetics” itself has become an oddity, known to many only through the name of the fictional Sirius Cybernetics Corporation invented by Douglas Adams as part of his “Hitchhiker’s Guide” universe. Cybernetics had a great first act, but it ran out of new ideas with predictive power. And in a way that’s a shame, because the key insight of cybernetics is one that, regardless of whether it leads to new scientific advances, could lead humankind to a more sophisticated understanding of cause and effect.

Cybernetics was followed by catastrophe theory, chaos theory, complex systems theory, and so on. Each new wave yielded new explanatory power, and beyond that, new metaphors. With the new metaphors came hype. And within the scientific establishment there was some backlash to the hype (yes, feedback loops pop up everywhere once you start looking for them). I feel torn: hype has no place in scientific research, but in popular writing about science, tasteful enthusiasm for deep ideas has the power to awaken wonder and inspire an appreciation of just how beautifully complex the world is.

But I never told you what happened to Oedipus after he fulfilled the prophecy. When he was king of Thebes there was a great plague, and in trying to figure out what had caused it, he came to realize that he himself was the cause. And he acknowledged his culpability publicly, with an act of self-mutilation that symbolized his former blindness to the machinations of fate.

So, pity Oedipus. And while you’re at it, get ready to pity yourself and me and everybody else in the year 2020, because in the current year and the years after, many intelligent and well-meaning people will be doing their best to steer two complex systems (the biosphere and the world economy) that we don’t understand. We’re flying blind, and there’s a good chance that some of our interventions will backfire and have ironic consequences for reasons that will only be obvious in retrospect.^{6}

POSTSCRIPT:

Having written the preceding paragraph in early June and having re-read it in mid-June, I think our biggest problem isn’t that we humans will take action based on simplistic models of how the world works and that those actions will have perverse consequences. I think a bigger problem is that we’ll know the right thing to do and we’ll nevertheless fail to take the simple steps that actually work, just because we get tired of doing the right thing, day after day. It’s so hard. It was hard from the start, but at least at the start it was something different to do. Now it’s hard and *boring*.

And now that I’ve re-read *those* words, I see an even gloomier possibility. It seems clear that, at a societal level, abandoning social distancing now is self-destructive. But what if all the people going around in public without face masks this week are, in a certain sense, making completely rational individual decisions? After all, the main purpose of wearing a standard-issue mask is to protect others, not yourself. If you’re wearing a mask when no one else is, you’re putting up with discomfort and inconvenience for the sake of other people and their acquaintances — people you don’t even know^{7}, and if your action turns out to save someone’s life you’ll never find out about it. You’d be *so* much more comfortable if you weren’t wearing the mask; so taking off that mask would be the *rational* thing to do, wouldn’t it?

This pandemic won’t kill off humankind, and neither will the next. But if a species that evolved intelligence for its survival-value ends up going extinct because of the selfish rationality of its individual members, that might be the biggest irony of all.

*Thanks to Sandi Gubin, Dave Jacobi, Andy Latto, Fred Lunnon, Gareth McCaughan, Shecky Riemann, Evan Romer, and Steve Strogatz.*

Next month: Math, games, and Ronald Graham.

**ENDNOTES**

#1: This should not be confused with a situation in which making X bigger makes Y smaller and likewise making Y bigger makes X smaller. That might naively seem “even more negative”. But that’s an example of *positive* feedback, in terms of how a change in X will tend to reinforce itself rather than reverse itself over time.

#2: The cowardly lie “Nobody knew wildlife-management was so complicated” can be saved until after the candidate is safely elected.

#3: Here I’m assuming that the deflection angle *θ* is so close to 0 that we can replace sin *θ* in the differential equation by *θ*, resulting in a linear differential equation that gives a good approximation to the pendulum’s actual behavior. Nonlinear differential equations are extremely important, but nonlinearity isn’t one of the themes of this essay.

#4: When oscillations grow without bound in a linear model of a phenomenon, the upshot in the real world is that the oscillations grow until the system leaves the regime within which linearity is a good approximation to reality.

#5: The word “cybernetics” derives from the Greek κυβερνήτης (“steersman”), from the root meaning to steer, navigate, or govern; fittingly, “governor” was the name James Watt gave to the feedback mechanism of his steam engine.

#6: For instance: having everybody stay home during a pandemic seems like a good way to prevent viral transmission, and it probably is, on a societal level. But if the severity of an infection depends on initial viral load, this strategy has the unintended effect that uninfected people who shelter with infected people stand to get worse cases of the illness. I learned about this from Siddhartha Mukherjee’s article listed in the References. It’s not clear to me how many epidemiologists or public health officials took this into account in the earliest days of the coronavirus pandemic. Who knew? Did they try to tell us? Were we listening hard enough?

#7: I’m thinking of words from the Twilight Zone episode “Button, Button” that we hear more than once (with chilling implications the last time we hear it): “… someone whom you don’t know.” If there’s a more horrifying dramatization of the tragedy of the commons, I haven’t seen it.

**REFERENCES**

Tim Harford, “LaLa Land: Galileo’s Warning”, http://timharford.com/2019/11/cautionary-tales-ep-3-lala-land-galileos-warning/

Siddhartha Mukherjee, “How Does the Coronavirus Behave Inside a Patient?”, The New Yorker, March 26, 2020. Yes, I know this article is reportage, not science. If any of you can point me toward relevant medical literature, please do so in the Comments.

“So let me get this straight, Mr. Propp: you plan to go to England to work with a mathematician who doesn’t even know you exist?”

It was 1982, I was a college senior applying for a fellowship that I hoped would send me to Cambridge University for a year, and the interviewer was voicing justified incredulity at my half-baked plan to collaborate with John Conway.

I’d read about Conway and his multifarious mathematical creations in Martin Gardner’s Mathematical Games column in Scientific American, and I’d become an ardent fan; I’d devoured his book “On Numbers and Games” and I’d even done some epigonic^{1} work on my own, trying to extend the theory of two-player games to allow for a third player. But I hadn’t even taken the step of writing to the man, and I had to sheepishly admit as much to the interviewer.

“You sound a bit like Luke Skywalker heading off to meet Yoda,” the interviewer said. His jest made me worry that I wouldn’t get the fellowship, but he must have believed in me more than his joke suggested. I was awarded a Knox Fellowship, and later that year I went to England on a Knox, as I liked to say (enjoying the resulting homophonic confusion).

**“DR. CONWAY WANTS TO TALK TO YOU”**

Conway was a celebrity among mathematicians but hadn’t risen to the top academic rank at Cambridge University. Perhaps that was partly due to his refusal to draw a line between the serious and the playful the way most mathematicians do. After he’d discovered three eminently respectable algebraic structures called the Conway groups^{2}, he’d resolved that from then on he would devote himself to whatever interested him regardless of what other people thought. This resolution showed itself clearly in his subsequent output. Conway’s most profound and distinctive contribution to mathematics, his theory of surreal numbers^{3}, was shot through with inspirations coming from the study of games, and the achievement he was best known for in the broader world was his invention of a kind of computer-aided solitaire called Conway’s Game of Life. And, unlike most mathematicians, Conway didn’t confine his research to one particular area; his breadth of interests would have smacked of dilettantism if he hadn’t made fundamental discoveries in the topics he turned his attention to. It was hard for more traditional academics to know what to make of him. When I arrived at Cambridge University, Conway was a Lecturer, not a Professor.

Later on, I realized I’d been lucky that Conway wasn’t off taking a sabbatical somewhere (perhaps in the U.S.) during the year I’d left the U.S. to work with him in England!

I found Conway in the Trinity College Common Room one day. He was (as he would remain throughout his life) happy to have a conversation with a stranger. I introduced myself and told him about the work I’d been doing on three-player games. I was looking at what are called *impartial* games, of which the prototypical example is Bouton’s game of Nim. In Nim, the “board” consists of one or more heaps of counters, and a legal move for a player is to take away as many counters as the player wants from a single pile. The player who makes the last move wins. What makes the game “impartial” is that every move that is available to one player is available to the other. There’s a beautiful mathematical theory of how to win impartial two-player games, and I told Conway that I wanted to extend it to three players who take turns in cyclical fashion.

If we’re watching a three-player impartial game in progress, and we freeze the action, there are four possibilities. First, the player who’s about to move (call her Natalie) could have such a strong position that, if she keeps her wits about her, she can guarantee that she’ll make the last move and win the game, regardless of what her two adversaries do. Second, the player who gets to move after Natalie (call him Oliver) might have a strategy that lets *him* win, no matter what. Third, the player who gets to move after Oliver (call him Percival) might have a winning strategy against the other two. Lastly, it’s possible that *no* player has a winning strategy — that any two players have the ability to defeat the third (leaving aside the issue of how coalitions might form or dissolve, or how agreements to share a prize might be enforced). I called these four cases N, O, P, and Q. Much of my preparatory work on the problem of classifying positions in three-player games could be summarized in the following table, whose simplicity hides the amount of work required to justify it:

For instance, the upper left entry signifies that if we have two positions, each of which is of type P, and we smoosh them together, then the resulting position is either of type P or of type Q.
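The recursion behind this classification is easy to put into code. Here’s a small Python sketch (my own illustration, not code from the actual project) of one natural way to formalize the solo-win cases for three-player, last-player-wins Nim, leaving coalitions aside: a position is type N if the mover can move to a type-P position, type O if every move leads to a type-N position, type P if every move leads to a type-O position, and type Q otherwise.

```python
from functools import lru_cache

# Classify positions of three-player, last-player-wins Nim into the four
# types N, O, P, Q described above. A position is a tuple of heap sizes;
# a move removes one or more counters from a single heap.

@lru_cache(maxsize=None)
def classify(heaps):
    heaps = tuple(sorted(h for h in heaps if h > 0))
    if not heaps:
        return "P"  # game over: the player two moves back made the last move
    results = set()
    for i, h in enumerate(heaps):
        for take in range(1, h + 1):
            rest = heaps[:i] + (h - take,) + heaps[i + 1:]
            results.add(classify(tuple(sorted(x for x in rest if x > 0))))
    if "P" in results:
        return "N"       # the mover can hand the next player a losing position
    if results == {"N"}:
        return "O"       # every move gives the *next* player the win
    if results == {"O"}:
        return "P"       # every move gives the *third* player the win
    return "Q"           # no single player can force a win alone
```

Smooshing two positions together is just concatenating their heap lists; for example, `classify((1, 1, 1))` comes out `"P"`, and `classify((1, 1, 1, 1, 1, 1))` comes out `"P"` as well, consistent with the claim that the sum of two type-P positions is of type P or Q.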

I showed Conway the table and described what I was hoping to do next. What I didn’t tell him — what I didn’t have the guts to tell him — is that I was hoping to do the work with him. I mean, who was I, not even a proper graduate student, to suggest that it would be worth Conway’s time to collaborate with me? John expressed approval of my research plan, wished me luck, and encouraged me to let him know how things went. And that, I assumed, was that.

A few days later, one of my flatmates told me that while I’d been off attending classes there’d been a telephone call for me from some secretary at DPMMS (the Department of Pure Mathematics and Mathematical Statistics). When I called her back, she said “Dr. Conway wants to talk to you,” but she wouldn’t tell me more, except to say that he had seemed upset. She arranged a place and time for Conway and me to meet, and when I met him, he began by apologizing.

“This is your project, not mine,” he said, “but I couldn’t stop thinking about it, and I even did some work on it. I’m sorry. I don’t usually do things like this.”

I reassured him that I’d hoped all along that we’d work on three-player games together. He was relieved, because he said he didn’t think it was right for someone to poach someone else’s research project (especially that of a younger person).

As it turned out, guilt wasn’t the only thing bothering John; he was also frustrated mathematically. “I was able to prove all the claims in your addition table,” he said, “but I couldn’t figure out how to prove that the sum of two type-O positions can’t be another type-O position.”

“Oh yes, that’s the hard one,” I said, and I showed him how the proof went.^{4} And so my collaboration with John Conway was launched.

I hasten to say that the proof I showed John was totally in the style that I’d learned from his book, and that it didn’t use any tricks he didn’t already know. It just took more work. Consider that I’d had months to work on the theory; John had only learned about it from me a few days before. I had no doubt then (and have no doubt now) that if John had set aside a few hours for the task, as I had done earlier that year, he would’ve found the proof. But I won’t deny that it was a huge boost to my ego to know that I’d proved something about games that had stumped him the first time he tried to prove it.

**WITH CLIPBOARD AND BABY**

Usually we would meet at a local coffee shop called Fitzbillies. John was the scribe, filling page after page with calculations. One time, shortly after the birth of his son Oliver, we worked at his home, where I got to meet his then-wife, the mathematician Larissa (“Lara”) Queen. We entered his home through the back door, set in a featureless wall that faced a parking lot; it reminded me a bit of Bag End from “The Hobbit”. Sometimes he’d bring little Oliver to the cafe with us, and John would somehow balance the clipboard and the baby and do math while keeping his young son happy.

I wish that I’d saved some of those pieces of paper, or that I even remembered some of what was on them. The phrase “tribal markings” has stayed with me; it was the name John gave to a system for discriminating between positions based on how they behaved when you added *k* Nim heaps of size 1 to them, for *k* = 1, 2, 3, … . Ultimately, what sank the enterprise (or at least my enthusiasm for it) was that John’s extension of my theory didn’t seem to apply to any actual games in an interesting way. In nearly all positions in nearly all three-player impartial games, any two players can gang up on the third if they make a plan and stick to it. Years later, it occurred to me that John and I should have taken a break from delving into 3-player impartial games and taken a look at 4-player impartial games. Even though in most such games any three players can gang up on the fourth, one can look at how one two-player alliance fares against another, and we might have found something worth publishing. In the end, many years later, I did publish an article on three-player games, but it didn’t include any of the work I’d done with John in Cambridge.

I also attended a class John taught on Games, Groups, Lattices, and Loops, and while I didn’t warm to most of the topics he covered (perhaps because I didn’t put in any time playing with them on my own), I was struck by the way his ideas about games turned out to play a role in his work on lattices, codes, and packings, with connections that became even clearer in the decade that followed (see his article with Sloane listed in the References). What are the chances that a mathematician who loved games would have the luck to find that games secretly underlie other subjects he studies? It almost seemed as if mathematical reality was bending itself to his will — that he had “root access” to the Platonic realm of pure form.

Of course, I don’t really believe that. The likeliest explanation is that what attracted John to work in these areas was some subtle affinity between them — an affinity that reflected some hidden mathematical substructure they had in common. Which problems seize a mathematician’s fancy, and which ones leave a mathematician’s soul unstirred? These things are as mysterious as physical attraction, but just as some people have a physical type they’re attracted to, I think John had a “type” in the mathematical domain, so that even though his interests were broad, there is something “Conway-ish” about the problems he tackled, and he had a keen sense of smell when it came to scenting out Conway-ish problems.

**PRINCETON AND ELSEWHERE**

Not many years after I came to visit him in England, John moved to the U.S. and became a professor at Princeton. He traveled a lot, giving talks at conferences across the country, participating in research retreats, and even spending time at mathematical summer camps for high schoolers and middle schoolers. I went to many of Conway’s talks, some of them rather outrageous (I remember the time when, lacking a damp paper towel, he licked an overhead slide clean so that he could re-use it), but the thing that I found most striking is that he never gave the same talk twice. For him, giving a talk was an improvisatory performance, an extension of his love of one-on-one conversation.

I remember a time when I was hoping to snag Conway’s interest in a problem that struck me as Conway-ish, hoping to recreate the sort of collaboration we’d had in 1982. I drew a triangle with some lines cutting through it, something like this,

and then I asked Conway if he could add more lines so that all the small regions cut out by the lines were triangles. He thought a bit, and drew some lines:

I nodded, and then said “Do you think that if I draw *any* finite number of lines passing through a triangle, there’s always a way for you to add more lines so that *all* the small regions cut out by the lines are triangles?” He said “Let me think about that,” and as the curiosity-bug bit into his brain, he began to draw pictures, make observations, and formulate conjectures. But he was no fool; he could see that I was deliberately trying to entice him into working on the problem, and he was too proud to want to be seen as one who is so easily seduced. He shuddered as if shaking off an unpleasant memory and said “You know, I don’t have to work on just ANY damned problem!”

The problem is still unsolved, as far as I know. (For more info, see the Math Forum webpage listed in the References.)

**ATLANTA**

During the past twenty years, most of my conversations with Conway took place in Atlanta, at a meeting of math-y, magic-y people held every two years called the Gathering for Gardner. (Actually the “for” is officially supposed to be rendered as the number “4”, but I find the cutesiness a bit too much.) I never had a chance to talk to Gardner himself at one of these Gatherings; he stopped attending in the late 90s, partly because he wanted to be with his wife who didn’t like travel, but mostly because he hated adulation.^{5}

John, on the other hand, never minded being the center of attention, and rarely missed a Gathering. He loved to talk, and people loved to listen. He, however, didn’t always like to listen (especially as he grew older), and he seldom went to the formal talks held in the ballroom, preferring to linger in the anteroom and converse with whoever was interested in talking to him. This became a problem for me, because I didn’t want John to talk to just anyone — I wanted him to talk to *me*: about frieze patterns, boundary invariants for tilings, sphere packings, surcomplex numbers, group theory, knot theory, etc. John, however, was just as happy to perform magic tricks for strangers as he was to discuss our shared mathematical interests. Or perhaps he sometimes found my company dull, and was happy for the relief provided by other interlocutors eager to chat with him on other topics?

One of the last times I saw Conway was at a Gathering for Gardner in Atlanta in 2014. This Gathering was held specifically in John’s honor, and I gave a talk there on his indirect contribution to the theory of random tilings. Characteristically, he wasn’t at the talk; he preferred one-on-one conversation to attending presentations. If you were there that year, you might at one point have spied me sitting in the anteroom on the floor at Conway’s feet, and you might have thought it looked odd. Why had I adopted this undignified position? Because I had Conway’s ear, I needed to sit, there was no other chair, and I feared that if I left to get a chair, someone else would snag his attention and I’d never be able to finish the conversation.

It’s only recently occurred to me that, to the extent that Conway in later life became a less considerate person, the attention of fans like myself may have played a facilitating role. One reason people behave as well as they do is that bad behavior comes at a social price. If you’re an ordinary person, spending most of your time in a particular place, hanging out with a limited supply of people, and you’re rude to enough people for a long enough time, you’ll eventually run out of people who seek your company. But when you’re a star the way Conway was, there’s always another eager fan to shower you with attention, no matter how many people you’ve alienated. John was never unkind to me (the rudest thing he ever said to me, after I made some intelligent comment, was “You’re not as dumb as you look” and I think he meant it affectionately), but I’ve heard from a few others (women, I’m sorry to say) to whom he was not so polite, or to whom he displayed a creepy kind of attentiveness. I’ve written elsewhere about geniuses, but it now seems to me that a deeper problem has to do with how communities choose heroes, and how communities treat those heroes. I feel torn between two uncomfortably clashing beliefs: that heroes are necessary or at least inevitable, and that hero-worship damages the souls of the worshipper and the worshipped. Maybe some readers of this essay have thought more deeply about this than I have and will have useful insights.

**THE END**

Conway succumbed to COVID-19 in April 2020. I’m glad that while he was still alive I let him know how big a role he played in my life (something I neglected to do in the case of Martin Gardner). I think it’s fair to say that he was the Beatles of mathematics, not just because he was from Liverpool, but because so much of John’s work is so damned catchy. Just as many Lennon-McCartney songs have a memorable “hook”, many of John’s best creations have a way of sticking in the mind once you understand them, in a way that most mathematical discoveries don’t. A layperson with an interest in mathematics can get a surface appreciation of John’s work in a way that just isn’t possible for 99 percent of contemporary mathematical research. Here’s one example of an especially accessible Conway theorem (an isolated aperçu as opposed to a piece of a bigger story): If you extend the sides of triangle *ABC* as shown, beyond each vertex by the length of the side opposite that vertex, the six points you obtain all lie on a single circle, centered at the incenter of the triangle.

It’s hard to believe that as simple a geometric proposition as Conway’s circle theorem could have lain undiscovered for more than a score of centuries, but it did. (For a beautiful proof-without-words of this proposition, see the proof by Colin Beveridge listed in the References.)

It’s easy for songwriters to feel that all the best tunes, chord progressions, and hooks have already been used by the songwriters who came before. Likewise, if you’re a pure mathematician whose job is to create new games of pure thought, it’s easy to feel that all the beautiful simple ideas have already been thought of — that our forebears have already turned the mathematical topsoil, leaving us the more arduous task of cutting through rock in search of undiscovered gems. The main thing I learned from John is that even if the supply of beautiful yet simple mathematical truths is in some sense finite, we’re nowhere near the bottom of it. Conway’s career is an existence proof that a career like his is possible, or at least was possible up through the year 2020. I hope and believe that at least throughout my lifetime there’ll still be plenty of scope for mathematicians of his temperament to find new thought-games that somehow manage to be compellingly simple yet enduringly deep.

*Thanks to Tibor Beke, Nancy Blachman, Sandi Gubin, David Jacobi, Joe Malkevich, Evan Romer, and Shecky Riemann.*

Next month: The Mathematics of Irony.

**REFERENCES**

BAAM! (Bay Area Artists and Mathematicians) and G4G (Gathering 4 Gardner), Remembering John Conway, https://youtu.be/Ru9fX3VPR9Y

Matt Baker, “Some mathematical gems from John Conway”, https://mattbaker.blog/2020/04/15/some-mathematical-gems-from-john-conway/ .

Colin Beveridge, Conway’s circle, a proof without words, https://aperiodical.com/2020/05/the-big-lock-down-math-off-match-14/ .

John Conway, On Numbers and Games.

John Conway and Neil Sloane, “Lexicographic Codes: Error-Correcting Codes from Game Theory”, IEEE Transactions on Information Theory, Vol. IT-32, No. 3, May 1986; available at http://neilsloane.com/doc/Me122.pdf .

Donald Knuth, Surreal Numbers: How Two Ex-Students Turned on to Pure Mathematics and Found Total Happiness, 1974.

MathOverflow, Conway’s lesser-known results, https://mathoverflow.net/questions/357197/conways-lesser-known-results .

James Propp, “Three-player impartial games”, Theoretical Computer Science 233 (2000), pp. 263–278; available from https://arxiv.org/abs/math/9903153.

Jim Propp, “Conway’s impact on the theory of random tilings”, talk presented at G4G11 in 2014; video at https://www.youtube.com/watch?v=e_729Ehb4vQ .

Siobhan Roberts, Genius at Play: The Curious Mind of John Horton Conway.

Siobhan Roberts, “Travels with John Conway, in 258 Septillion Dimensions,” New York Times, May 16, 2020; at https://www.nytimes.com/2020/05/16/science/john-conway-math.html .

Stan Wagon, Math Forum Problem-of-the-Week 812, “A Pre-Sliced Triangle”; at http://mathforum.org/wagon/spring96/p812.html .

**ENDNOTES**

#1. The term “epigone” is usually an insult; who wants to be called “second-rate”? But if the scale extends beyond second-rate to third-rate, fourth-rate, etc., being second-rate isn’t so bad! And it’s no disgrace to be deemed not-as-good-as-Conway.

#2. The Conway groups are examples of algebraic structures called finite simple groups. There are several infinite families of finite simple groups and then twenty-six “bonus” finite simple groups that we call sporadic, including the three Conway found. The largest of the sporadic finite simple groups is called the Monster, and near the end of his life Conway confided that, although it was his fondest wish to understand why the Monster existed, he doubted that he would live that long.

#3. The surreal number system is an extension of the ordinary real number system that includes infinite and infinitesimal quantities as well as familiar numbers like seventeen and the square root of two. The term “surreal numbers” was coined by Donald Knuth, whose book on the subject occupied me for many happy hours when I was in high school.

#4. In my article “Three-player impartial games”, the proposition that gave Conway trouble appears as Claim 7, and it hinges on five of the six preceding claims.

#5. I would sometimes bring students to attend the Gathering with me. One student’s father had initial misgivings about having his daughter attend some sort of strange convocation held in honor of a man who didn’t even show up. I’m guessing it reminded him of those ashrams in the U.S. that are dedicated to the teachings of a guru back in India. To him, Gardner-fandom seemed like a bit of a cult. And I can’t say he’s completely wrong.

This sort of invocation of chemistry as a magic history-spanning bridge can be traced back to James Jeans, the English scientist and mathematician, who in his 1940 book “An Introduction to the Kinetic Theory of Gases” wrote: “If we assume that the last breath of, say, Julius Caesar has by now become thoroughly scattered through the atmosphere, then the chances are that each of us inhales one molecule of it with every breath we take.” The science writer Sam Kean recently wrote an entire book, “Caesar’s Last Breath”, that takes this proposition as its starting point.

In between Jeans and Kean, other writers making the same point have replaced Caesar by Archimedes or Jesus or da Vinci. I prefer Archimedes, because he was the first of the ancient Greek mathematicians to come to grips with really big numbers and to connect the macroscopic and microscopic realms; in “The Sand Reckoner” he calculated how many grains of sand would fill the universe as the Greeks understood it.

As I write this essay in April 2020, human society has been violently tipped on its side, and the eight billion or so people who share this planet have come to realize how small the world has become epidemiologically. We’ve also become fearfully conscious of the contents of the air we bring into our bodies. Perhaps now is a good time to take a deep and hopefully healthy breath and think a bit about how the content of our lungs connects us to people far away in space and time, situated in a past that, even at a remove of a few months, feels very distant.

Molecules are tiny; Earth is huge; we’re somewhere in between. Our brains didn’t evolve to handle the difference in scale between microscopic events and events of daily life, or between events of daily life and global processes. We can fling around words and phrases like “nanotech” and “trillion dollar deficit”, but few of us really *get*, on a gut level, how small a nanometer is or how big a trillion is.

And yet, neuroanatomical evolution has already developed a wonderful approach to the problem of scale. Consider for instance the human ear; it must process a gamut of frequencies from 20 to 20,000 cycles per second. It does this using an organ called the cochlea, whose thousands of tiny hair cells respond to different frequencies. When the ear picks up a tone, the position of the hair-cell along the cochlea that responds to that particular tone corresponds roughly to the logarithm of that tone’s frequency — which may sound intimidating if you’re rusty with logarithms, but if you’ve ever played a piano you have an intuitive, kinesthetic sense for the logarithms of frequencies. Each time you shift your hand up an octave, you double the frequency of the note. Frequency is an exponential function of the position of your hand, and conversely, the position of your hand is the logarithm of the frequency produced. The cochlea is just like that, except that it’s receiving sound, not producing it. Some have called the cochlea an “inverse piano” to highlight this analogy.^{1}
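The piano analogy is easy to make concrete. Assuming the standard equal-tempered tuning convention (key 49 is A4 at 440 Hz, with 12 keys per octave), frequency is an exponential function of key position, and key position is the logarithm of frequency:

```python
import math

# Equal-tempered piano: frequency is exponential in key number, so key
# number is logarithmic in frequency -- the "inverse piano" idea.
# Convention assumed: key 49 is A4 = 440 Hz, 12 keys per octave.

def key_to_freq(key):
    return 440.0 * 2 ** ((key - 49) / 12)

def freq_to_key(freq):
    return 49 + 12 * math.log2(freq / 440.0)
```

Moving your hand up 12 keys doubles the frequency: `key_to_freq(61)` gives 880 Hz, one octave above A4.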

We can use exponentials and logarithms to try to get a handle on the large and small, but it’s easy to forget the key difference between counting “one, two, three, four, …” and counting “thousand, million, billion, trillion, …”: the former is an *arithmetic* progression (each term is equal to the previous term *plus* something, namely 1) while the latter is a *geometric* progression (each term is equal to the previous term *times* something, namely 1000). In more concrete terms: If we plot the four numbers one, two, three, and four on a number line, we get this:

On the other hand, if we plot the four numbers one thousand, one million, one billion, and one trillion on a number line, we get this:

Were you expecting to see four dots? Well, the “dot” at the left is actually two dots, one for one thousand and one for one million; at this scale the two dots are too close together to be distinguished. Meanwhile, the dot for one trillion is about a mile off to the right.
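If you’d like to check the two pictures numerically rather than with dots on a line, here’s a quick sketch: on a linear scale the thousand and the million collapse into nearly the same point, while on a logarithmic scale all four numbers land evenly spaced.

```python
import math

values = [10**3, 10**6, 10**9, 10**12]  # thousand, million, billion, trillion

# Linear scale: normalize so that one trillion sits at position 1.0.
linear = [v / 10**12 for v in values]   # thousand and million are nearly 0

# Logarithmic scale: the same four numbers become evenly spaced.
logs = [math.log10(v) for v in values]  # 3, 6, 9, 12 -- equal steps of 3
```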

If you’ve never seen the video “Powers of Ten” or the similar video “Cosmic Zoom”, I suggest you take a break from reading this essay and watch one or both of them. Touring the universe from the largest scales we know about to the smallest is a great way to get a feeling for how the different levels of our universe fit together. The largest structures we know about are roughly 10^{41} times larger than the smallest. We’re somewhere in between, on a cosmic piano that has roughly a hundred and forty octaves.^{2}
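The octave count is a one-line calculation: each octave is a doubling, so a span of 10^{41} in scale covers log_2 of 10^{41} doublings.

```python
import math

# A scale ratio of 10**41 corresponds to 41 * log2(10) doublings ("octaves").
octaves = 41 * math.log2(10)   # a bit over 136
```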

When one does calculations that involve big things, small things, and things that are in between, one sometimes finds that the in-between things are close to the midpoint on a logarithmic scale, in the way that middle C is close to the midpoint of a piano keyboard. One example of this phenomenon is the proposition that there are about as many molecules in a teaspoonful of water as there are teaspoonfuls of water in all Earth’s oceans (about 200 sextillion in both cases). A more mind-boggling example is one I learned from Bill Gosper, who computed that a molecule of polyethylene^{3} spanning the observable Universe, suitably folded, would just about fit in NASA’s Hangar 1, one of the largest buildings ever constructed. Another phenomenon along similar lines is the way you can use an oil drop to measure the wavelength of visible light; the drop is much larger than the wavelength of light, but ponds are much larger than droplets, so when you let a droplet spread evenly over the surface of a pond, you can create a layer of oil so thin that the resulting interference patterns let you determine the wavelength of the light.^{4}
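The teaspoon claim is easy to check on the back of an envelope. Here’s a sketch using round values I’m supplying myself (a 5 mL teaspoon, oceans totaling about 1.3 billion cubic kilometers, i.e. 1.3 × 10^{21} liters), so treat the inputs as rough:

```python
AVOGADRO = 6.022e23          # molecules per mole
TSP_LITERS = 0.005           # one teaspoon, about 5 mL (rounded)
WATER_MOLAR_MASS_G = 18.0    # grams per mole of water
OCEAN_LITERS = 1.3e21        # total ocean volume, a rough figure

# Molecules in one teaspoon of water (5 g / 18 g per mole, times Avogadro):
molecules_per_tsp = (TSP_LITERS * 1000 / WATER_MOLAR_MASS_G) * AVOGADRO

# Teaspoonfuls of water in all the oceans:
tsp_in_oceans = OCEAN_LITERS / TSP_LITERS

ratio = molecules_per_tsp / tsp_in_oceans   # close to 1
```

Both quantities come out around 2 × 10^{23}, the “200 sextillion” of the text.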

The claim about Caesar’s last breath is yet another story about three length-scales, spanning the logarithmic ladder from molecules to people to planets. How big are these things? The diameter of the planet is about eight million times the height of the average human adult, which in turn is about five billion times the diameter of a molecule of air. With such disparate ratios (eight million versus five billion) it might seem that in the range from single molecule to entire atmosphere we humans are off-center, logarithmically speaking, but that’s because we’re ignoring two important things: gas kinetics and the shape of the atmosphere. Molecules in a gas aren’t packed like oranges at the grocer’s; they’re constantly jostling one another, in an all-against-all molecular melee that results in far fewer molecules per liter than the size of a molecule would suggest. Also, our atmosphere is not a ball of gas but a *hollow* ball, eight thousand miles across from its northernmost point to its southernmost but only ten miles thin; in relative terms, that’s five times thinner than the shell of a chicken’s egg. When you do the math (as Archimedes would have loved to do, given his famous work on the volume and surface area of spheres), you find that the number of molecules of air in a lung is quite close to the number of lungfuls of air on Earth. And this suggests that the number of molecules from Caesar’s last breath in your lungs right now is approximately 1.

The answer is sufficiently close to 1 that it’s probably sensitive to issues ignored in our oversimplified mixing model.^{5}
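Here’s the back-of-the-envelope version of that calculation, with round values I’m choosing for illustration (a half-liter breath, 22.4 liters per mole of gas at standard conditions, an atmosphere of about 5.15 × 10^{18} kg with mean molar mass 29 g/mol), and assuming perfect mixing:

```python
AVOGADRO = 6.022e23
MOLAR_VOLUME_L = 22.4        # liters per mole of gas at STP (approximate)
BREATH_LITERS = 0.5          # a typical resting breath (assumption)
ATMOSPHERE_KG = 5.15e18      # total mass of Earth's atmosphere (approximate)
AIR_MOLAR_MASS_G = 29.0      # mean molar mass of air

# Molecules in one lungful, and molecules in the whole atmosphere:
molecules_per_breath = (BREATH_LITERS / MOLAR_VOLUME_L) * AVOGADRO
molecules_in_atmosphere = (ATMOSPHERE_KG * 1000 / AIR_MOLAR_MASS_G) * AVOGADRO

# If Caesar's breath is now uniformly mixed through the atmosphere, the
# expected number of its molecules in your current lungful is the fraction
# of the atmosphere his breath represents, times your breath's molecule count:
expected = molecules_per_breath ** 2 / molecules_in_atmosphere
```

With these inputs `expected` lands between 1 and 2, i.e. “approximately 1”, and small changes to the assumed breath size move it noticeably, which is exactly the sensitivity noted above.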

Jeans’ claim ignores the massive amount of molecular recombination going on in our atmosphere. In the chemical dance of geological and biological processes, oxygen and nitrogen atoms (the primary constituents of air) change partners all the time. It’s conceivable that most of the oxygen molecules in Caesar’s last breath got split long ago, and hence, strictly speaking, no longer exist. Of course, we could rescue Jeans’ claim by replacing molecules by atoms, and then similar calculations would apply.

I myself prefer to go back to the formulation that I read as a child, the one that talks about Archimedes’ lifetime pulmonary output instead of his dying breath; aside from the fact that it’s less morbid, it’s also much more likely to be true. That extra factor of half a billion, coming from all those breaths, makes the proposition much more certain, even if some of those breaths contained molecular “repeats”, and even if some of the molecules escaped into outer space, or sit sequestered in permafrost, or were cleft by lightning or metabolism.

I suggest Terry Tao’s lecture “The Cosmic Distance Ladder” as a follow-up to “Powers of Ten” and “Cosmic Zoom”. But such pedagogical tools can only go so far to give us a feeling for the power of raising things to powers. An old story from India tells how a grand vizier, having invented the game of chess for the ruler’s enjoyment, asks that his reward be one rice grain for the first square of the board, two grains for the second, four grains for the next, eight grains for the next, and so on, up until the 64th and last square of the board. The king thinks the vizier is letting him off easy and agrees to his terms. Only later, when he starts trying to pay the rice to the vizier, does he discover that he’s made a mistake. How much rice was the vizier asking for? 1+2+4+8+…+2^{63} comes to about 18 quintillion grains of rice, which is a thousand times greater than the amount of rice that is currently grown in the world in a year. In one version of the story, the king, upon realizing that all the rice in his kingdom wouldn’t suffice, nullifies his promise by having the vizier executed.
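The vizier’s total is easy to verify. In the snippet below, the comparison with world rice production uses assumed round figures (about 500 billion kilograms grown per year, at roughly 50,000 grains per kilogram), so the final ratio is only an order-of-magnitude estimate:

```python
# The vizier's request: 1 + 2 + 4 + ... + 2**63 grains of rice.
total_grains = 2**64 - 1
print(total_grains)  # 18446744073709551615 -- about 18 quintillion

# Rough comparison with annual world rice output (assumed figures):
# ~500 billion kg per year, at ~50,000 grains per kilogram.
grains_per_year = 500e9 * 50_000
print(total_grains / grains_per_year)  # several hundred years' worth of harvest
```

The exact sum is 18,446,744,073,709,551,615; dividing by the assumed annual output gives a ratio in the high hundreds, consistent with the essay’s “about a thousand times” figure.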

Even when you think you understand exponential growth, it’s easy to slip up. Here’s an example from The Giant Golden Book of Mathematics, a book I loved as a child and still admire: “An amoeba is placed in an empty jar. After one second, the amoeba splits into two amoebas, each as big as the mother amoeba. After another second, the daughter amoebas split in the same way. As each new generation splits, the number of amoebas and their total bulk doubles each second. In one hour the jar is full. When is it half-full?” It’s tempting to answer “half an hour”, but the correct answer is one second before the hour is up. Actually, an even better answer is “That’s a ridiculous question.” There are 3600 seconds in an hour, and 3600 rounds of doubling would lead to a growth of the initial biomass by a factor of more than 10^{1000}. There aren’t enough octaves on the cosmic piano for that. Before the hour is up, the amoebas would fill all of the known universe.
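To see just how ridiculous the question is, we can compute when the amoebas would overflow the observable universe. The two volumes below are assumed round figures for illustration (an amoeba of about 10^{−12} cubic meters, a universe of about 10^{80} cubic meters); the exact choices barely matter, since each factor of a trillion only buys the amoebas another forty seconds:

```python
import math

# 3600 doublings multiply the biomass by 2**3600. In powers of ten:
print(3600 * math.log10(2))  # about 1083.7, i.e. a factor of roughly 10**1084

# When do the amoebas outgrow the observable universe?
# Assumed round figures: one amoeba ~ 1e-12 cubic meters,
# observable universe ~ 1e80 cubic meters.
AMOEBA_M3 = 1e-12
UNIVERSE_M3 = 1e80

seconds = math.ceil(math.log2(UNIVERSE_M3 / AMOEBA_M3))
print(seconds)  # about 306 seconds -- barely five minutes into the hour
```

Roughly five minutes in, the jar, the room, and the rest of the cosmos are all spoken for, with fifty-five minutes of doubling still on the clock.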

Those imaginary amoebas teach us something that we forget at our peril: exponentially growing quantities look negligible until they don’t — or look innocuous until it’s too late to do anything about them. Why worry if ten grains of rice become twenty? It’s still less than a handful. Why worry if ten cases of a communicable disease become twenty? More people die each year from falling out of bed.^{6} It’s easy to dismiss things that are growing exponentially when they’re small. Albert Allen Bartlett famously wrote “The greatest shortcoming of the human race is our inability to understand the exponential function.” Let’s hope we as a species can avoid the grand vizier’s fate.

To end on an upbeat (or dare I say “inspiring”?) note, it’s worth remembering that the same solar energy that kindled Archimedes’ brain by way of chemical bonds in the oxygen he breathed also feeds *our* brains. If we focus enough brainpower on the problems we face as a species, it’s possible we’ll be able to come up with ways to cope with the current crisis and stumble our way through to the next crisis, and the next, and the next.

*Thanks to John Baez, Bill Gosper, Sandi Gubin, Hans Havermann, Michael Kleber, Henri Picciotto, Evan Romer, and Simon Plouffe.*

Next month: Confessions of a Conway Groupie.

**ENDNOTES**

#1. The central nervous system must have its own tricks for dealing with the problem of disparate scale; for instance, perceptible levels of loudness, from the barely discernible to the headache-inducing, span many orders of magnitude, as do perceptible levels of illumination. If you know something about how the brain encodes intensity of auditory and visual stimuli, please post in the comments!

#2. Thinking of this piano puts me in mind of a scene from “The 5000 Fingers of Dr. T.”, which as a child I found so disconcerting that I couldn’t watch the movie.

#3. Polyethylene is a chain of hydrogens attached to a carbon backbone of indefinite length, so in principle a polyethylene molecule could be long enough to span the observable universe; this hypothetical molecule, if folded up tightly, would fit inside Hangar 1.

#4. Can anyone provide a good reference for this?

#5. When we’re dealing with quantities much bigger than 1, or much smaller, an order of magnitude or two usually doesn’t have a qualitative effect on the conclusions we can draw, but that’s not the case when quantities are logarithmically close to 1, or 10^{0}. If the expected number of “special” molecules in our lungs at any given time is computed to be around 10^{−2} = .01, then we could say with confidence that most of the time our lungs don’t contain any. On the other hand, if the expected number of special molecules in our lungs at any given time is computed to be around 10^{2} = 100, and we model the number of such molecules in our lungs as a Poisson random variable, theory tells us that the standard deviation is 10, so that the probability that our lungs contain none at all right now — a “ten-sigma event” — is minuscule.
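For a Poisson random variable the probability of seeing zero events is simply e^{−mean}, so both regimes in this endnote can be checked in two lines (the means 0.01 and 100 are the ones used above):

```python
from math import exp

# For a Poisson random variable X with a given mean, P(X = 0) = e**(-mean).

# Mean 0.01: "none at all" is overwhelmingly likely.
print(exp(-0.01))  # about 0.99

# Mean 100: "none at all" is astronomically unlikely.
print(exp(-100))   # about 3.7e-44
```

With a mean of 100, the zero-count outcome has probability around 10^{−43}, far beyond any everyday notion of “unlikely” (and well past ten sigma).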

#6. Propagation of a novel disease through a vulnerable population is described pretty well by an exponential function in the early stages of the epidemic, when most of the population is immunologically naive. In later stages of the epidemic, the sigmoid curve predicted by the logistic model provides a better fit.
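The exponential-early, sigmoid-late behavior is easy to see numerically. Here is a minimal sketch with illustrative parameters of my own choosing (initial count 10, growth rate 0.5 per unit time, carrying capacity one million), not fitted to any real epidemic:

```python
from math import exp

def logistic(t, x0=10, K=1_000_000, r=0.5):
    """Logistic growth: K / (1 + ((K - x0)/x0) * exp(-r t)).
    Illustrative parameters only; x0 = initial count, K = carrying capacity."""
    return K / (1 + (K - x0) / x0 * exp(-r * t))

def exponential(t, x0=10, r=0.5):
    """Pure exponential growth with the same initial count and rate."""
    return x0 * exp(r * t)

# Early on the two curves agree closely; later the logistic curve
# flattens toward K while the exponential one keeps exploding.
for t in (0, 10, 20, 30, 40):
    print(t, round(exponential(t)), round(logistic(t)))
```

At t = 10 the two columns nearly match; by t = 30 the exponential curve has blown past the carrying capacity by a factor of thirty while the logistic curve is leveling off just below it.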
