I don’t know which is stranger: the way mathematicians often embrace ideas that at first glance (and later glances!) seem nonsensical, or the way mathematicians often hold obvious truths at arm’s length, scrutinizing them with a skeptical eye and asking “How do we really know it’s true?”

Here’s one such obvious truth. While it’s possible to pack four disks of diameter 1 into a 2-by-2 square as shown in Figure 1, it’s evident that there’s no way to fit *five* disks of diameter 1 into a 2-by-2 square. It seems crazy to ask for a rigorous proof of this; what could be intuitively clearer? And while we’re at it, it seems obvious that you can’t fit more than six disks of diameter 1 into a 2-by-3 rectangle, or more than eight disks of diameter 1 into a 2-by-4 rectangle, and so on (all our disks today will be of diameter 1, so soon I’ll stop saying “of diameter 1”). Aiming for more generality, let’s define *P _{n}* to be the proposition that you can’t pack more than 2

*n*disks of diameter 1 into a 2-by-

*n*rectangle. It’s reasonable to believe that

*P*is true for all positive integers

_{n}*n*, isn’t it?

Likewise, imagine the task of picking a bunch of whole numbers between 1 and *n*, no two of which differ by exactly 3 or exactly 5. In a whimsical math-education vein, we might imagine we’re inviting the numbers to a party, but numbers that differ by 3 or 5 hate each other, and we want to invite as many numbers as we can without inviting two that don’t get along. (I don’t especially like this artificial way of turning math problems into stories but I know that some people find it helpful.) For *n*=10, an example of such a “friendly” set is {1,2,3,9,10}; for *n*=20, an example of a friendly set is {1,2,3,9,10,11,17,18,19}. The way I constructed each of these sets was by marching through the whole numbers from 1 to *n*, successively picking those numbers that get along with all the numbers I’ve already picked. I made a video of what this process looks like:

(If you can’t access the video, here’s how it starts: After I’ve picked 1, 2, and 3, I can’t pick 4, 5, or 6, because each of them differs by 3 from a previously-picked number; and I can’t pick 6, 7, or 8, because each of them differs by 5 from a previously-picked number; but I can pick 9, because it doesn’t differ from any previously-picked number by 3 or 5.)

Let’s define *Q _{n}* to be the proposition that if you follow this systematic procedure for constructing a friendly set of numbers between 1 and

*n*, the set of numbers you obtain won’t just be a

*pretty big*friendly set — it’ll be the

*biggest possible*friendly set of numbers between 1 and

*n*. It’s reasonable to believe that

*Q*is true for all positive integers

_{n}*n*, isn’t it?

In particular, *Q*_{20} seems reasonable, and when I’m teaching a course in discrete mathematics, I sometimes let students believe it for a while. I let the class “prove” to me that it’s impossible to find a friendly set containing more than nine numbers from 1 to 20. I let them feel happy with their proof. And then, I call on the dissenting student whose hand has been waving in the air for several minutes, and she astonishes the rest of the class by exhibiting a friendly set of *ten* numbers between 1 and 20! This leads to several valuable lessons, such as: #1. An intuitively appealing argument is not the same thing as a proof. #2. When you’re trying to solve an optimization problem, the kind of approach that computer scientists call a “greedy” approach does not always lead you to the best solution. And, of course: #3. The majority isn’t always right.

I invite you to think about *Q*_{20}, and to see why it’s false; I predict that your “Aha!” moment will provide pleasure. Alternatively, you can peek at the End Notes.

What about our disk-packing problem? There, too, our intuition leads us astray: *P _{n}* is false once

*n*becomes large enough. For instance, there’s room for 2001 disks in a 2-by-1000 rectangle. We might conclude from this that we should rely less on our instincts, but perhaps the remedy for our error is to have

*more*instincts pulling us in different directions, because the key to the problem is something that bees know: when you’re packing disks in the plane, the hexagonal packing of disks is more efficient than the square packing of disks. (Well, bees know about making compartments, not packing disks, but the pictures are similar.) You can form small fragments of the hexagonal packing in a 2-by-

*n*frame and stick them together in a manner that fills the rectangle more efficiently than the obvious eggs-in-a-carton square packing arrangement. Figure 2 shows what the left end of the packing looks like. For small values of

*n*, the vacancies at the two ends of the rectangle are a big lose, but as

*n*gets larger, the efficiency-gap between the two packings gets narrower and narrower, until finally, when

*n*is in the hundreds, the gap flips, and now, with an ever-increasing lead, the packing shown in Figure 2 becomes the winner in this race.

*P*becomes false, and stays false as

_{n}*n*continues to grow.

So now you can see why a mathematician looking at Figure 1 might want a rigorous way to support the proposition *P*_{2} (“There’s no room for five disks”) that’s more solid than “Hey, it’s obvious; just look at the picture!” One down-to-earth approach that comes to mind is thinking about area. Maybe the total area of five disks of diameter 1 (and radius 1/2) exceeds the area of a 2-by-2 square? Nope; 5 times *π*/4 falls just short of 4 (because *π* falls just short of 3.2). So forget about area; to prove *P*_{2} we’ll need something else. The simplest rigorous proof of *P*_{2} uses a versatile technique called the pigeonhole principle, which (in its simplest form) says that you can’t put *m*+1 envelopes into *m* pigeonholes without at least two of the envelopes ending up in the same pigeonhole.

To prove *P*_{2}, we’ll first use the pigeonhole principle (with *m*, the number of pigeonholes, equal to 4) to show that if you pick five points in a 1-by-1 square, two of them must be at distance √2/2 or less. Divide the 1-by-1 square into four small 1/2-by-1/2 squares, as shown in Figure 3. (Technical aside: points that border two of the small squares are to be considered as belonging to both, and the middle point is to be considered as belonging to all four squares.) If you pick five points in the 1-by-1 square, two of them must belong to the same 1/2-by-1/2 square, so that the distance between the two is at most √2/2 (the length of the diagonal of the small square that contains them). The “pigeonholes” here are the four 1/2-by-1/2 squares, and the “envelopes” are the five points.

We can now prove *P*_{2} using the method of proof by contradiction: we’ll show that the assumption that we can fit five non-overlapping disks into a 2-by-2 square leads to a logical (or rather a geometric) contradiction. Suppose there were a way to fit five non-overlapping disks inside a 2-by-2 square. What can we deduce about the centers of the disks? They can’t be too close to one another: as Figure 4 shows, if two disks of radius 1/2 aren’t overlapping, their centers must be at least distance 1 from each other. (Look at the line segment joining the centers; the two circles divide the segment into a piece of length 1/2 on either end and possibly a piece in the middle, for a total length of at least 1.)

Also notice that each of those five center-points must be at distance at least 1/2 from the boundary of the square. That is, the five centers must lie in the 1-by-1 square sketched with a dashed line in Figure 5. But we saw in the last paragraph that when you have five points in a 1-by-1 square, there must be two of the five that are within distance √2/2 ≈ .71 of each other. That contradicts our earlier conclusion that all five of the center-points are at distance at least 1 from each other. This contradiction proves that a packing of five non-overlapping disks into a 2-by-2 square doesn’t exist. So we have proved *P*_{2}.

Maybe you think the proof is too complicated, but every simpler “proof” that I know has the unfortunate flaw that it also proves *P*_{1000}*, *which is false!

I must confess that in my heart of hearts I do think *P*_{2} is obvious. But, as a mathematician, I try not to heed that inner voice of “true belief”. That’s partly because, throughout my mathematical career, I’ve seen plenty of things that seemed just as obvious as *P*_{2} that turned out to be false, and it’s made me a bit commitment-phobic when it comes to lending credence to mathematical propositions. Also, once I call *P*_{2} obvious, what will it lead to? Will I call *P*_{3} obvious, or *P*_{4}? That way lies madness, or rather, wrongness, because *P*_{1000} is false. I’d rather not start down that road. If I want not to be wrong, then I’d rather err on the side of being too skeptical than on the side of being too credulous. The risk here is that my skepticism may give people the impression that this sort of pedantry is the life-blood of mathematics. Far from it! Pedantry (or, to use less loaded language, rigor) merely provides the skeleton that keeps the whole organism from collapsing. Imagination is the true life-blood of mathematics; it’s what animates the skeleton and enables it to dance.

You can adapt my proof of *P*_{2} to prove *P*_{3}, *P*_{4}, *P*_{5}, *P*_{6}, and *P*_{7} (see the End Notes for details), but the approach stops working there. If any of you find any tweaks that let you prove some cases beyond *P*_{7}, please post to the Comments section! I don’t believe mathematicians have gone very far down this road, since well before *P _{n}* becomes false, the proofs are likely to become unmanageably intricate.

*P*_{2} is not the example I use when I introduce the pigeonhole principle. Instead, I use the *Q*_{20} example. I tell the students “Most of you were convinced that there’s no way to find more than *nine* numbers between 1 and 20, no two of which differ by 3 or 5. Then Barbara showed you a way to find such a set of *ten* numbers. But how do you know Barbara’s method is best? How do you know that, five minutes from now, Barbara or Charlie won’t astound you by finding such a set of *eleven* numbers?” The students offer arguments, but they agree that the arguments aren’t very convincing. For one thing, the arguments that they offer sound suspiciously similar to the ones they used when (incorrectly) proving the false assertion *Q*_{20}.

The discomfort of being mistaken has led the students to divest themselves of the approach that led them into error, and they are now standing in their intellectual underwear. At last they are ready to put on something new. At this point, I draw the following table on the board:

1 6

2 7

3 8

4 9

5 10

11 16

12 17

13 18

14 19

15 20

I point out (or lead the students to notice) that if some set of numbers between 1 and 20 contains no two elements differing by (exactly) 5, such as the set {1,2,3,9,10,11,17,18,19}, then that set can’t contain more than one number in each row of the table. Since the table has only ten rows, a set of numbers between 1 and 20 containing no two numbers differing by 5 can’t contain more than ten numbers in total. (This is an application of the pigeonhole principle with *m*, the number of pigeonholes, equal to 10: the “pigeonholes” are the rows of the table, and the “envelopes” are the numbers in the set we are trying to build.) So neither Barbara nor Charlie nor I nor anyone else will ever find a set of eleven numbers between 1 and 20 no two of whose elements differ by 5, even leaving aside the constraint that no two numbers in the set can differ by 3. (Valuable lesson #4. Sometimes changing a problem — adding or dropping constraints — gives insights that can be applied to the original version of the problem.)

At this point, the pigeonhole principle becomes more than just something that students will accept because the teacher and the textbook author think it’s important; the students feel the need for a rigorous way of proving that the best solution they found to a puzzle (finding a big friendly set of numbers between 1 and 20) is the best solution *anyone* could find.

I’d like more of the lessons I teach to be driven by this sort of intellectual tension. Walter Bagehot may have been over-dramatic when he wrote “One of the greatest pains to human nature is the pain of a new idea”, but there is something uncomfortable about new ideas like the pigeonhole principle; while we teachers can try to make new ideas more comfortable for the students, maybe part of our job is to make the students uncomfortable first, by saying “Here’s a compelling problem that you can’t solve with the tools we’ve given you so far,” or “Here’s a problem that you’ll solve incorrectly if you misapply what you’ve learned so far.”

I don’t want to make students so uncomfortable that they’ll associate math with pain, or worse, shame (and you’ll note that in my *Q*_{20} example, most of the students go for the incorrect answer, so whatever stigma is attached to being wrong gets diluted). But I want a little bit of suspense and discomfort to propel the students forward in their journey toward knowledge. I’m always on the lookout for ways to add this to my teaching. The history of mathematics can play a helpful pedagogical role here, since in many ways the learning process for individual people recapitulates the learning process for the human race as a whole, mistakes and all. Paradoxes too can help us learn; even as they confuse us, they wake us up.

The magician Teller, a former teacher, says similar things about the role of discomfort in both magic and teaching. “Magic doesn’t wash over you like a gentle, reassuring lullaby. In magic, what you see comes into conflict with what you know, and that discomfort creates a kind of energy and a spark that is extremely exciting. That level of participation that magic brings from you by making you uncomfortable is a very good thing. … When I go outside at night and look up at the stars, the feeling that I get is not comfort. The feeling that I get is a kind of delicious discomfort at knowing that there is so much out there that I do not understand and the joy in recognizing that there is enormous mystery, which is not a comfortable thing. This, I think, is the principal gift of education.”

Next month (April 17): The Paintball Party Problem.

*Thanks to John Baez, Nick Baxter, Ron Graham, Sandi Gubin, Richard Guy, Hans Havermann, Dick Hess, and Henri Picciotto for help with this essay.
*

**END NOTES**

Regarding packing disks in strips of width 2: To prove *P _{n}* for small values of

*n*, we start by noticing that in any packing of disks of diameter 1 into a 2-by-

*n*rectangle, the centers of the disks must lie in a concentric 1-by-(

*n*-1) rectangle. We can subdivide this smaller rectangle into 2

*n*tiny rectangles of height 1/2 and length (

*n*-1)/

*n*. As long as

*n*is 7 or less, the diagonal of each of these rectangles is less than 1; this lets us mimic the proof of

*P*

_{2}and apply the pigeonhole principle to conclude that you can’t fit more than 2

*n*disks into a 2-by-

*n*rectangle.

I’m sure *P*_{8} is true as well, but I don’t have a proof. It’s not known for what exact value of *n* the proposition *P _{n}* first fails. That is, we don’t know the smallest value of

*n*for which it’s possible to fit 2

*n*+1 circles into a 2-by-

*n*rectangle. We know that it’s at most 164, thanks to a construction found by Dick Hess. The Hess packing is similar to the packing shown in Figure 2 but includes modifications that make the packing even tighter at the ends.

Our way of proving *P _{n}* for

*n*up through 7 isn’t just an illustration of the power of the pigeonhole principle; it’s also an instance of a widespread mathematical phenomenon called “duality“. We saw two paragraphs ago that the

*feasibility*of covering a 1-by-

*n*rectangle with rectangles of diameter less than 1 implies the

*optimality*of the packing of 2

*n*disks of diameter 1 in a 2-by-

*n*rectangle. Putting it differently, the

*existence*of a certain kind of covering implies the

*nonexistence*of a packing better than the obvious one. In the theory of linear programming, this interplay between feasibility and optimality manifests itself as the duality principle, but the applicability of duality goes beyond what are normally thought of as linear programming problems. In fact, duality is the bedrock on which two recent proofs in the study of packing were built: the proofs that in 8 dimensions and 24 dimensions, the best known (hyper-)sphere-packings aren’t just

*the densest ones we’ve found*, but are

*as dense as possible*. See John Baez’s post E

_{8}is the Best or Erica Klarreich’s Quanta Magazine article “Sphere Packing Solved in Higher Dimensions“.

Regarding “friendly” sets of numbers: To get a friendly set consisting of ten numbers between 1 and 20 (thereby disproving *Q*_{20}), you can take the odd numbers between 1 and 20 (or the even numbers between 1 and 20). You can use the pigeonhole principle to prove that for all *n,* the biggest possible friendly set of numbers between 1 and *n* has size *n*/2 when *n* is even and size (*n*+1)/2 when *n* is odd. For any individual *n*, you could prove this by a brute-force search, since there are only 2* ^{n}* subsets of the set {1,2,3,4,…,

*n*} to check (and you can get by with substantially fewer cases if you use some intelligence). But to prove it for all possible values of

*n*, you need some sort of general argument, such as the one based on the pigeonhole principle.

One pre-reader of this essay, Henri Picciotto, wondered whether one could prove *P*_{2} by brute force (or, as he put it, “trial and error”). Unfortunately, the method of brute force only applies when there are a finite number of cases to check. There are *infinitely* many configurations to consider for proving *P*_{2}; the only way to reduce this set to a finite set of cases would be to lump lots of configurations together in some geometrical fashion. Maybe there’s a way to do this, but it’s unclear to me whether people would find it simpler than the pigeonhole principle proof.

A big selling point of the proof of *P*_{2} that uses the pigeonhole principle is that it scales up nicely, at least up through *P*_{7}; this kind of economy of effort (using a tool more than once) is prized by mathematicians. Murray Klamkin, riffing on a George Pólya quote, said “A mathematician is someone who works hard at being lazy.” That is, a mathematician should invest effort in creating powerful and versatile tools. Or as a sorcerer might put it: Enchant your brooms, and they’ll carry your pails for you!

**REFERENCES**

The question about packing disks in a rectangle of width 2 goes back at least to 1983, when it was raised by Gabor Fejes-Tóth. The first major advances are described in Zoltán Füredi’s article “The Densest Packing of Equal Circles into a Parallel Strip”, Discrete Comput. Geom. 6:95-106 (1991); the text can be viewed online.

You can find out more about the disk packing problem from Elwyn Berlekamp and Joe Buhler’s write-up (which appeared in the Mathematical Sciences Research Institute newsletter in 2003), as well as Richard Hess’ article “Puzzles from Around the World”, which appeared in the book “The Mathematician and Pied Puzzler” (Berlekamp/Rodgers 1999); the question appears on page 61, and the solution appears on page 76. The Wolfram Demonstrations Project also has a demonstration of the paradox, contributed by Karl Scherer and Richard Hess. It shows how a 2-by-164 rectangle can be filled with 329 unit disks.

The quote from Teller is taken from the article “Teaching: Just Like Performing Magic” (The Atlantic, January 21, 2016).

Erich Friedman has a great website (“Erich’s Packing Center”) on packing problems. There are lots of opportunities for recreational mathematicians to make contributions here.

Pingback: Another nice counting problem from Jim Propp | Mike's Math Page

Pingback: A challenge for professional mathematicians | Mike's Math Page

Michael G.Very cool post! A little something I realized while reading it – you actually can prove 5 disks won’t fit in the 2×2 square with a simple area argument. None of the circles can fit any further into the corners than in the diagram, so there is 1-(pi/4) ‘wasted’ space that can never hold part of a circle. The remaining space is 3+pi/4, which is just smaller than 5*pi/4 (since pi>3) and thus cannot hold 5 unit circles. Of course, this technique is too weak to apply to any larger rectangles (5+pi/4 > 7*pi/4).

LikeLike

Pingback: When Not to Expect What You’re Expecting |

Pingback: The Big Internet Math-Off 2019, Group 3 – Jim Propp vs Sophie Carr | The Aperiodical