Believe It, Then Don’t: Toward a Pedagogy of Discomfort

I don’t know which is stranger: the way mathematicians often embrace ideas that at first glance (and later glances!) seem nonsensical, or the way mathematicians often hold obvious truths at arm’s length, scrutinizing them with a skeptical eye and asking “How do we really know it’s true?”


Figure 1. Four disks of diameter 1 packed in a 2-by-2 square.

Here’s one such obvious truth. While it’s possible to pack four disks of diameter 1 into a 2-by-2 square as shown in Figure 1, it’s evident that there’s no way to fit five disks of diameter 1 into a 2-by-2 square. It seems crazy to ask for a rigorous proof of this; what could be intuitively clearer? And while we’re at it, it seems obvious that you can’t fit more than six disks of diameter 1 into a 2-by-3 rectangle, or more than eight disks of diameter 1 into a 2-by-4 rectangle, and so on (all our disks today will be of diameter 1, so soon I’ll stop saying “of diameter 1”). Aiming for more generality, let’s define Pn to be the proposition that you can’t pack more than 2n disks of diameter 1 into a 2-by-n rectangle. It’s reasonable to believe that Pn is true for all positive integers n, isn’t it?

Likewise, imagine the task of picking a bunch of whole numbers between 1 and n, no two of which differ by exactly 3 or exactly 5. In a whimsical math-education vein, we might imagine we’re inviting the numbers to a party, but numbers that differ by 3 or 5 hate each other, and we want to invite as many numbers as we can without inviting two that don’t get along. (I don’t especially like this artificial way of turning math problems into stories but I know that some people find it helpful.) For n=10, an example of such a “friendly” set is {1,2,3,9,10}; for n=20, an example of a friendly set is {1,2,3,9,10,11,17,18,19}. The way I constructed each of these sets was by marching through the whole numbers from 1 to n, successively picking those numbers that get along with all the numbers I’ve already picked. I made a video of what this process looks like:

(If you can’t access the video, here’s how it starts: After I’ve picked 1, 2, and 3, I can’t pick 4, 5, or 6, because each of them differs by 3 from a previously-picked number; and I can’t pick 6, 7, or 8, because each of them differs by 5 from a previously-picked number; but I can pick 9, because it doesn’t differ from any previously-picked number by 3 or 5.)

Let’s define Qn to be the proposition that if you follow this systematic procedure for constructing a friendly set of numbers between 1 and n, the set of numbers you obtain won’t just be a pretty big friendly set — it’ll be the biggest possible friendly set of numbers between 1 and n. It’s reasonable to believe that Qn is true for all positive integers n, isn’t it?

In particular, Q20 seems reasonable, and when I’m teaching a course in discrete mathematics, I sometimes let students believe it for a while. I let the class “prove” to me that it’s impossible to find a friendly set containing more than nine numbers from 1 to 20. I let them feel happy with their proof. And then, I call on the dissenting student whose hand has been waving in the air for several minutes, and she astonishes the rest of the class by exhibiting a friendly set of ten numbers between 1 and 20! This leads to several valuable lessons, such as: #1. An intuitively appealing argument is not the same thing as a proof. #2. When you’re trying to solve an optimization problem, the kind of approach that computer scientists call a “greedy” approach does not always lead you to the best solution. And, of course: #3. The majority isn’t always right.

I invite you to think about Q20, and to see why it’s false; I predict that your “Aha!” moment will provide pleasure. Alternatively, you can peek at the End Notes.

What about our disk-packing problem? There, too, our intuition leads us astray: Pn is false once n becomes large enough. For instance, there’s room for 2001 disks in a 2-by-1000 rectangle. We might conclude from this that we should rely less on our instincts, but perhaps the remedy for our error is to have more instincts pulling us in different directions, because the key to the problem is something that bees know: when you’re packing disks in the plane, the hexagonal packing of disks is more efficient than the square packing of disks. (Well, bees know about making compartments, not packing disks, but the pictures are similar.) You can form small fragments of the hexagonal packing in a 2-by-n frame and stick them together in a manner that fills the rectangle more efficiently than the obvious eggs-in-a-carton square packing arrangement. Figure 2 shows what the left end of the packing looks like. For small values of n, the vacancies at the two ends of the rectangle are a big lose, but as n gets larger, the efficiency-gap between the two packings gets narrower and narrower, until finally, when n is in the hundreds, the gap flips, and now, with an ever-increasing lead, the packing shown in Figure 2 becomes the winner in this race. Pn becomes false, and stays false as n continues to grow.


Figure 2. A better way to pack disks in a long strip. The dashed lines group the disks into triplets that make the pattern clearer.

So now you can see why a mathematician looking at Figure 1 might want a rigorous way to support the proposition P2 (“There’s no room for five disks”) that’s more solid than “Hey, it’s obvious; just look at the picture!” One down-to-earth approach that comes to mind is thinking about area. Maybe the total area of five disks of diameter 1 (and radius 1/2) exceeds the area of a 2-by-2 square? Nope; 5 times π/4 falls just short of 4 (because π falls just short of 3.2). So forget about area; to prove P2 we’ll need something else. The simplest rigorous proof of P2 uses a versatile technique called the pigeonhole principle, which (in its simplest form) says that you can’t put m+1 envelopes into m pigeonholes without at least two of the envelopes ending up in the same pigeonhole.


Figure 3. Five points in a 1-by-1 square can’t all be far from one another; two of them must be close.

To prove P2, we’ll first use the pigeonhole principle (with m, the number of pigeonholes, equal to 4) to show that if you pick five points in a 1-by-1 square, two of them must be at distance √2/2 or less. Divide the 1-by-1 square into four small 1/2-by-1/2 squares, as shown in Figure 3. (Technical aside: points that border two of the small squares are to be considered as belonging to both, and the middle point is to be considered as belonging to all four squares.) If you pick five points in the 1-by-1 square, two of them must belong to the same 1/2-by-1/2 square, so that the distance between the two is at most √2/2 (the length of the diagonal of the small square that contains them). The “pigeonholes” here are the four 1/2-by-1/2 squares, and the “envelopes” are the five points.


Figure 4. The centers of two disjoint disks of radius 1/2 can’t be too close.

We can now prove P2 using the method of proof by contradiction: we’ll show that the assumption that we can fit five non-overlapping disks into a 2-by-2 square leads to a logical (or rather a geometric) contradiction. Suppose there were a way to fit five non-overlapping disks inside a 2-by-2 square. What can we deduce about the centers of the disks? They can’t be too close to one another: as Figure 4 shows, if two disks of radius 1/2 aren’t overlapping, their centers must be at least distance 1 from each other. (Look at the line segment joining the centers; the two circles divide the segment into a piece of length 1/2 on either end and possibly a piece in the middle, for a total length of at least 1.)


Figure 5. When you pack disks in a 2-by-2 square, the centers of the disks all lie in a 1-by-1 square.

Also notice that each of those five center-points must be at distance at least 1/2 from the boundary of the square. That is, the five centers must lie in the 1-by-1 square sketched with a dashed line in Figure 5. But we saw in the last paragraph that when you have five points in a 1-by-1 square, there must be two of the five that are within distance √2/2 ≈ .71 of each other. That contradicts our earlier conclusion that all five of the center-points are at distance at least 1 from each other. This contradiction proves that a packing of five non-overlapping disks into a 2-by-2 square doesn’t exist. So we have proved P2.

Maybe you think the proof is too complicated, but every simpler “proof” that I know has the unfortunate flaw that it also proves P1000, which is false!

I must confess that in my heart of hearts I do think P2 is obvious. But, as a mathematician, I try not to heed that inner voice of “true belief”. That’s partly because, throughout my mathematical career, I’ve seen plenty of things that seemed just as obvious as P2 that turned out to be false, and it’s made me a bit commitment-phobic when it comes to lending credence to mathematical propositions. Also, once I call P2 obvious, what will it lead to? Will I call P3 obvious, or P4? That way lies madness, or rather, wrongness, because P1000 is false. I’d rather not start down that road. If I want not to be wrong, then I’d rather err on the side of being too skeptical than on the side of being too credulous. The risk here is that my skepticism may give people the impression that this sort of pedantry is the life-blood of mathematics. Far from it! Pedantry (or, to use less loaded language, rigor) merely provides the skeleton that keeps the whole organism from collapsing. Imagination is the true life-blood of mathematics; it’s what animates the skeleton and enables it to dance.

You can adapt my proof of P2 to prove P3, P4, P5, P6, and P7 (see the End Notes for details), but the approach stops working there. If any of you find any tweaks that let you prove some cases beyond P7, please post to the Comments section! I don’t believe mathematicians have gone very far down this road, since well before Pn becomes false, the proofs are likely to become unmanageably intricate.

P2 is not the example I use when I introduce the pigeonhole principle. Instead, I use the Q20 example. I tell the students “Most of you were convinced that there’s no way to find more than nine numbers between 1 and 20, no two of which differ by 3 or 5. Then Barbara showed you a way to find such a set of ten numbers. But how do you know Barbara’s method is best? How do you know that, five minutes from now, Barbara or Charlie won’t astound you by finding such a set of eleven numbers?” The students offer arguments, but they agree that the arguments aren’t very convincing. For one thing, the arguments that they offer sound suspiciously similar to the ones they used when (incorrectly) proving the false assertion Q20.

The discomfort of being mistaken has led the students to divest themselves of the approach that led them into error, and they are now standing in their intellectual underwear. At last they are ready to put on something new. At this point, I draw the following table on the board:

1   6
2   7
3   8
4   9
5 10
11 16
12 17
13 18
14 19
15 20

I point out (or lead the students to notice) that if some set of numbers between 1 and 20 contains no two elements differing by (exactly) 5, such as the set {1,2,3,9,10,11,17,18,19}, then that set can’t contain more than one number in each row of the table. Since the table has only ten rows, a set of numbers between 1 and 20 containing no two numbers differing by 5 can’t contain more than ten numbers in total. (This is an application of the pigeonhole principle with m, the number of pigeonholes, equal to 10: the “pigeonholes” are the rows of the table, and the “envelopes” are the numbers in the set we are trying to build.) So neither Barbara nor Charlie nor I nor anyone else will ever find a set of eleven numbers between 1 and 20 no two of whose elements differ by 5, even leaving aside the constraint that no two numbers in the set can differ by 3. (Valuable lesson #4. Sometimes changing a problem — adding or dropping constraints — gives insights that can be applied to the original version of the problem.)

At this point, the pigeonhole principle becomes more than just something that students will accept because the teacher and the textbook author think it’s important; the students feel the need for a rigorous way of proving that the best solution they found to a puzzle (finding a big friendly set of numbers between 1 and 20) is the best solution anyone could find.

I’d like more of the lessons I teach to be driven by this sort of intellectual tension. Walter Bagehot may have been over-dramatic when he wrote “One of the greatest pains to human nature is the pain of a new idea”, but there is something uncomfortable about new ideas like the pigeonhole principle; while we teachers can try to make new ideas more comfortable for the students, maybe part of our job is to make the students uncomfortable first, by saying “Here’s a compelling problem that you can’t solve with the tools we’ve given you so far,” or “Here’s a problem that you’ll solve incorrectly if you misapply what you’ve learned so far.”

I don’t want to make students so uncomfortable that they’ll associate math with pain, or worse, shame (and you’ll note that in my Q20 example, most of the students go for the incorrect answer, so whatever stigma is attached to being wrong gets diluted). But I want a little bit of suspense and discomfort to propel the students forward in their journey toward knowledge. I’m always on the lookout for ways to add this to my teaching. The history of mathematics can play a helpful pedagogical role here, since in many ways the learning process for individual people recapitulates the learning process for the human race as a whole, mistakes and all. Paradoxes too can help us learn; even as they confuse us, they wake us up.

The magician Teller, a former teacher, says similar things about the role of discomfort in both magic and teaching. “Magic doesn’t wash over you like a gentle, reassuring lullaby. In magic, what you see comes into conflict with what you know, and that discomfort creates a kind of energy and a spark that is extremely exciting. That level of participation that magic brings from you by making you uncomfortable is a very good thing. … When I go outside at night and look up at the stars, the feeling that I get is not comfort. The feeling that I get is a kind of delicious discomfort at knowing that there is so much out there that I do not understand and the joy in recognizing that there is enormous mystery, which is not a comfortable thing. This, I think, is the principal gift of education.”

Next month (April 17): The Paintball Party Problem.

Thanks to John Baez, Nick Baxter, Ron Graham, Sandi Gubin, Richard Guy, Hans Havermann, Dick Hess, and Henri Picciotto for help with this essay.


Regarding packing disks in strips of width 2: To prove Pn for small values of n, we start by noticing that in any packing of disks of diameter 1 into a 2-by-n rectangle, the centers of the disks must lie in a concentric 1-by-(n-1) rectangle. We can subdivide this smaller rectangle into 2n tiny rectangles of height 1/2 and length (n-1)/n. As long as n is 7 or less, the diagonal of each of these rectangles is less than 1; this lets us mimic the proof of P2 and apply the pigeonhole principle to conclude that you can’t fit more than 2n disks into a 2-by-n rectangle.

I’m sure P8 is true as well, but I don’t have a proof. It’s not known for what exact value of n the proposition Pn first fails. That is, we don’t know the smallest value of n for which it’s possible to fit 2n+1 circles into a 2-by-n rectangle. We know that it’s at most 164, thanks to a construction found by Dick Hess. The Hess packing is similar to the packing shown in Figure 2 but includes modifications that make the packing even tighter at the ends.

Our way of proving Pn for n up through 7 isn’t just an illustration of the power of the pigeonhole principle; it’s also an instance of a widespread mathematical phenomenon called “duality“. We saw two paragraphs ago that the feasibility of covering a 1-by-n rectangle with rectangles of diameter less than 1 implies the optimality of the packing of 2n disks of diameter 1 in a 2-by-n rectangle. Putting it differently, the existence of a certain kind of covering implies the nonexistence of a packing better than the obvious one. In the theory of linear programming, this interplay between feasibility and optimality manifests itself as the duality principle, but the applicability of duality goes beyond what are normally thought of as linear programming problems. In fact, duality is the bedrock on which two recent proofs in the study of packing were built: the proofs that in 8 dimensions and 24 dimensions, the best known (hyper-)sphere-packings aren’t just the densest ones we’ve found, but are as dense as possible. See John Baez’s post E8 is the Best or Erica Klarreich’s Quanta Magazine article “Sphere Packing Solved in Higher Dimensions“.

Regarding “friendly” sets of numbers: To get a friendly set consisting of ten numbers between 1 and 20 (thereby disproving Q20), you can take the odd numbers between 1 and 20 (or the even numbers between 1 and 20). You can use the pigeonhole principle to prove that for all n, the biggest possible friendly set of numbers between 1 and n has size n/2 when n is even and size (n+1)/2 when n is odd. For any individual n, you could prove this by a brute-force search, since there are only 2n subsets of the set {1,2,3,4,…,n} to check (and you can get by with substantially fewer cases if you use some intelligence). But to prove it for all possible values of n, you need some sort of general argument, such as the one based on the pigeonhole principle.

One pre-reader of this essay, Henri Picciotto, wondered whether one could prove P2 by brute force (or, as he put it, “trial and error”). Unfortunately, the method of brute force only applies when there are a finite number of cases to check. There are infinitely many configurations to consider for proving P2; the only way to reduce this set to a finite set of cases would be to lump lots of configurations together in some geometrical fashion. Maybe there’s a way to do this, but it’s unclear to me whether people would find it simpler than the pigeonhole principle proof.

A big selling point of the proof of P2 that uses the pigeonhole principle is that it scales up nicely, at least up through P7; this kind of economy of effort (using a tool more than once) is prized by mathematicians. Murray Klamkin, riffing on a George Pólya quote, said “A mathematician is someone who works hard at being lazy.” That is, a mathematician should invest effort in creating powerful and versatile tools. Or as a sorcerer might put it: Enchant your brooms, and they’ll carry your pails for you!


The question about packing disks in a rectangle of width 2 goes back at least to 1983, when it was raised by Gabor Fejes-Tóth. The first major advances are described in Zoltán Füredi’s article “The Densest Packing of Equal Circles into a Parallel Strip”, Discrete Comput. Geom. 6:95-106 (1991); the text can be viewed online.

You can find out more about the disk packing problem from Elwyn Berlekamp and Joe Buhler’s write-up (which appeared in the Mathematical Sciences Research Institute newsletter in 2003), as well as Richard Hess’ article “Puzzles from Around the World”, which appeared in the book “The Mathematician and Pied Puzzler” (Berlekamp/Rodgers 1999); the question appears on page 61, and the solution appears on page 76. The Wolfram Demonstrations Project also has a demonstration of the paradox, contributed by Karl Scherer and Richard Hess. It shows how a 2-by-164 rectangle can be filled with 329 unit disks.

The quote from Teller is taken from the article “Teaching: Just Like Performing Magic” (The Atlantic, January 21, 2016).

Erich Friedman has a great website (“Erich’s Packing Center”) on packing problems. There are lots of opportunities for recreational mathematicians to make contributions here.

9 thoughts on “Believe It, Then Don’t: Toward a Pedagogy of Discomfort

  1. Pingback: Another nice counting problem from Jim Propp | Mike's Math Page

  2. Pingback: A challenge for professional mathematicians | Mike's Math Page

  3. Michael G.

    Very cool post! A little something I realized while reading it – you actually can prove 5 disks won’t fit in the 2×2 square with a simple area argument. None of the circles can fit any further into the corners than in the diagram, so there is 1-(pi/4) ‘wasted’ space that can never hold part of a circle. The remaining space is 3+pi/4, which is just smaller than 5*pi/4 (since pi>3) and thus cannot hold 5 unit circles. Of course, this technique is too weak to apply to any larger rectangles (5+pi/4 > 7*pi/4).

    Liked by 1 person

  4. Pingback: When Not to Expect What You’re Expecting |

  5. Pingback: The Big Internet Math-Off 2019, Group 3 – Jim Propp vs Sophie Carr | The Aperiodical

  6. Pingback: Sphere-Packing |

  7. Pingback: Digging through old bookmarks | MIT Admissions

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s