Dedicated to the memory of Herb Wilf
Mathematicians celebrate the French thinker René Descartes for inventing Cartesian coordinates.1 But we should also remember him as the person who tilted the terrain of Europe’s mathematical alphabet, using early letters of the alphabet to signify known quantities and imbuing later letters (especially x) with the pungent whiff of the Unknown. If you learned to write quadratic expressions as ax2 + bx + c instead of xa2 + ya + z (and I’m guessing you did), it’s down to Descartes.2
My topic this month is polynomials like ax2 + bx + c. In school math, you first learned about x as an unknown, a number hiding behind a mask. (“What is x? Let’s find out.”) Later you learned to view x as a variable, so that a formula like y = ax2 + bx + c is a function or rule: if you give me an x, I’ll give you a y. (“What is x? No number in particular; x ranges over all real numbers.”) I’ll touch on both points of view today, but I’ll be stressing a viewpoint that’s probably less familiar, where x is neither an unknown nor a variable, but just, well, itself. From this perspective, polynomials appear as number-like objects in and of themselves, with their own habits and mating behavior.
Let’s start with something a bit silly. The Global Math Project website that has a polynomial with your name on it. Literally. Go to http://globalmathproject.org/personal-polynomial/ and type in your name, and the website will give you a mathematical expression that (unless you go by “JJ” or some other moniker whose letters are all the same) will contain one or more occurrences of the variable x; say hello to your Personal Polynomial. What makes it your Personal Polynomial is that if you replace x by the number 1, the expression turns into the numerical value of the 1st letter of your name (where A = 1, B = 2, etc.); if you replace x by the number 2, the expression turns into the numerical value of the 2nd letter of your name; and so on. For instance, when I typed in “JIM” the Personal Polynomial genie gave me the degree-two polynomial (5/2)x2 − (17/2)x + 16; with x = 1, my Personal Polynomial turns into 5/2 − 17/2 + 16 = 10, and sure enough, the 1st letter of my name is the 10th letter of the alphabet, J. Likewise, plugging in x = 2 gives 9 (the numerical value of I) and plugging in x = 3 gives 13 (the numerical value of M). That is, the computer found magic numbers a, b, and c with the property that the polynomial p(x) = ax2 + bx + c satisfies the three equations p(1) = 10, p(2) = 9, and p(3) = 13.
Try it! If you give the program a name with n letters, it’ll reply with a polynomial of degree n−1 or less that does the job.3
When you use the Personal Polynomial website it’s hard to see anything sinister lurking nearby. But algebraic expressions have a seductive slickness that can lead us to invest more faith in them than we should. Although you’d be unlikely to be so extremely silly as to plug x = 4 into my Personal Polynomial and then announce that the fourth letter of “JIM” should be a V (or that the letter after that should be the thirty-sixth letter of the alphabet), certain American policy-makers did something similar during the early days of the Covid epidemic: they fitted a degree-three polynomial to epidemiological data from the past and used it to predict the future.4 You can read more about this in Jordan Ellenberg’s book Shape, though for a tongue-in-cheek demonstration of the pitfalls of extrapolation it’s hard to beat what Mark Twain wrote over a century ago:
In the space of one hundred and seventy-six years the Lower Mississippi has shortened itself two hundred and forty-two miles. That is an average of a trifle over one mile and a third per year. Therefore, any calm person, who is not blind or idiotic, can see that in the Old Oölitiic Silurian Period, just a million years ago next November, the Lower Mississippi River was upwards of one million three hundred thousand miles long, and stuck out over the Gulf of Mexico like a fishing-rod. And by the same token any person can see that seven hundred and forty-two years from now the Lower Mississippi will be only a mile and three-quarters long, and Cairo and New Orleans will have joined their streets together, and be plodding comfortably along under a single mayor and a mutual board of aldermen. There is something fascinating about science. One gets such wholesale returns of conjecture out of such a trifling investment of fact.Mark Twain, “Life on the Mississippi”
POLYNOMIALS SET FREE
Students in algebra classrooms see polynomials in action in all kinds of ways. I already demonstrated substitution when I plugged x = 1 into (5/2)x2 − (17/2)x + 16 and got 10. There’s also the reverse problem of solving for x: if all I tell you about x is that (5/2)x2 − (17/2)x + 16 equals 10, what values might x have? 1 is one such value, but are there others?5 Here x plays the role of the unknown, the not-yet-known, the about-to-be-known, etc. And when you see a formula like y = ax2 + bx + c, chances are you’ll be asked to graph it.
But what I want to describe to you this month are generating functions, in which x is no longer a stand-in for an unknown number; x stands proudly on its own. And it brings along some friends.
In a way, generating functions are reminiscent of the numbers Euler gave us when he admitted a new player i into the number game and saw what new numbers it gave rise to, using no information about i except the equation i2 = −1. The big difference is that, whereas Euler decreed that i2 should equal −1, we won’t decree anything about x2 at all, or x3 or any higher powers (though we will decree that x0 equals 1). We remain agnostic about what x means and just manipulate expressions according to rules, like the rule that says that xa times xb equals xa+b (“Exponents add when you multiply powers”). x has been freed from its obligation to refer to anything outside of itself.
The noted mathematician Herb Wilf6 once wrote that “A generating function is a clothesline on which we hang up a sequence of numbers for display.” Wilf wrote the book on generating functions, literally, but I would like to contest, or at least elaborate upon, the word “display”. Clotheslines aren’t just for display; they’re also useful. And so are generating functions. I’ll give three applications, all of which have to do with dice.
ADDING DICE ROLLS AND MULTIPLYING POLYNOMIALS
Thousands of years ago, our ancestors felt threatened by mighty and seemingly lawless forces in the air above and the earth beneath and attributed them to mighty powers who might be propitiated by sacrifices or whose actions might at least be predicted through suitable forms of divination. (This was before we had degree-three polynomials.) The unpredictability of processes like the casting of lots seemed to have an affinity to the capriciousness of Nature and so were thought to provide a kind of channel to the supernatural powers that controlled both.
The ankle-bone (or “talus”) of a hoofed animal would have seemed like a natural choice of divinatory device: these four-sided objects were plentiful, their origin linked them to life and death, and if you tossed one, it was hard to know which side would face upward when it stopped rolling. It’s theorized that over time such oracular bones evolved into cubical dice. In any case we know that even if dice were invented for divinatory purposes, they became coopted for games of chance quite early; some of the oldest dice archeologists have found are dice that have been tampered with in such a way as to bias the outcome (so-called “crooked” or “gaffed” dice). This might indicate an attempt to affect the hazards of weather or warfare but seems more likely to indicate an attempt to acquire wealth at the expense of others. (Even today, we acknowledge the dare-I-say dicey nature of personal finance by referring to a large sum as a “fortune”.) Dice games were commonplace by the time European mathematicians like Pascal and Fermat laid the groundwork for the modern theory of probability in the 1600s.
Consider a die of the modern kind, with six faces showing the numbers 1 through 6 in the form of dots, or “pips” (where a face with 1 pip signifies the number 1, a face with 2 pips signifies the number 2, and so on). We’ll give this die its own Personal Polynomial (yeah, dice aren’t persons, but you know what I mean): the polynomial x1 + x2 + x3 + x4 + x5 + x6. That is, we take x to the power of each of the numbers shown on the die’s six faces, and we add those powers. (In precollege math it’s usual to write polynomials with the highest-degree term first and the lowest-degree term last, but for generating functions it’s more useful to do the reverse, so that the exponents count up instead of down.)
Why is it useful to associate polynomials with dice in this way? Because (as we’ll see) when we multiply the polynomials associated with two dice, we get useful information about what happens when we roll both dice and add the numbers that they show. Similarly, when we square the polynomial associated with one die, we get useful information about what happens when we roll the die twice and add the two numbers we see.
Before we dive into multiplying a degree-six polynomial by itself, let’s take the simpler example of a two-sided die (better known as a “coin”). Suppose the coin has 1 pip on one side and 2 pips on the other side. What can happen when you toss it twice and record the sum? You might get a 1 followed by a 1, or a 1 followed by a 2, or a 2 followed by a 1, or a 2 followed by a 2. So the total can be 2, 3, 3, or 4. We can make a two-by-two table of the four possibilities, which the alert reader (or even a woozy one) may recognize as a very small addition table that shows the sums you can get when you add 1-or-2 plus 1-or-2.
Since exponents add when you multiply powers, we find a similar pattern when we form a very small multiplication table that shows the products you can get when you multiply x1-or-x2 times x1-or-x2.
It’s no coincidence that these are exactly the terms you get when you multiply x1+x2 by itself and expand using the distributive law (or using “FOIL”, if you insist).
Likewise, imagine a three-sided die with 1 pip, 2 pips, and 3 pips on its three respective sides. We can make a table of all nine possibilities.
Reading the table, we find 1 way to roll a 2, 2 ways to roll a 3, 3 ways to roll a 4, 2 ways to roll a 5, and 1 way to roll a 6. Alternatively, we can multiply x1 + x2 + x3 by itself and collect like terms, obtaining the generating function (x1 + x2 + x3)2 = x2 + 2x3 + 3x4 + 2x5 + x6.
The deluxe version of the distributive law says that if you have one sum of numbers and you multiply it by another sum of numbers, you have to individually multiply each number in the first sum by each number in the second sum and then add up all the resulting products, being careful not to leave any out or include any twice. Meanwhile, a multiplication table is designed to force us to record each possible product once and only once by giving us a well-defined place to record the answer. So the nine terms in the expansion correspond to the nine entries in the small multiplication table.
Now, at this point you should be skeptical of the usefulness of generating functions. If all we care about is the respective numbers of ways to roll a 2, 3, 4, 5, or 6 using two rolls of our three-sided die, then the extra baggage of those plus-signs and powers of x seems like mere typographical distraction from the real message. But suppose we want to know whether two rolls of a six-sided die (the kind of die people actually use in games) is likelier to give us a sum that’s odd or even. We can use generating functions to solve this problem just by thinking about the generating function
p(x) = (x1 + x2 + x3 + x4 + x5 + x6)2 = x2 + 2x3 + … + 2x11 + x12
without actually expanding it out and writing down the intermediate terms.7
NEGATIVE ONE LENDS A HAND
The trick involves looking at p(−1). (This essay is called “Let x equal x”8, but that doesn’t mean I won’t sometimes want to let x equal some specific number like −1.) Actually, let’s first look at what we get when we replace x by −x in p(x). Remember that the polynomial p(x) is x2 + 2x3 + … + 2x11 + x12. So p(−x) is (−x) + 2(−x)3 + · · · + 2(−x)11 + (−x)12 , which equals x2 − 2x3 + … − 2x11 + x12. The coefficients don’t change in magnitude, but every second plus-sign gets flipped to a minus-sign. Notice that the outcomes in which the sum of the two numbers that are rolled is even correspond to terms with even exponent, which get left alone; but the outcomes in which the sum of the two numbers that are rolled is odd correspond to terms with odd exponent, whose sign gets flipped. So now, if you replace x by 1 in p(−x) (which is the same as replacing x by −1 in p(x)), you get the sum 1 − 2 + ··· − 2 + 1 in which the positive terms correspond to the ways to roll an even sum and the negative terms correspond to the ways to roll an odd sum.
That means in the contest between the forces of Even and the forces of Odd, we can assess the balance of power by seeing whether p(−1) is positive (in which case the outcomes with even sum outnumber the outcomes with odd sum) or negative (in which case the outcomes with odd sum outnumber the outcomes with even sum).
And here’s where the deus-ex-algebra descends upon the scene. Remember, p(x) equals (x1 + x2 + x3 + x4 + x5 + x6)2 , so p(−1) equals (−1+1−1+1−1+1)2. But that’s just 02, or 0. So it’s a stand-off between Even and Odd. That is, when you roll two dice, the sum has an equal chance of being even or odd.9
Now you still may be thinking you’d rather not think so hard; you (or an algebraic calculator) could just expand (x1 + x2 + x3 + x4 + x5 + x6)2 as
1x2 + 2x3 + 3x4 + 4x5 + 5x6 + 6x7 + 5x8 + 4x9 + 3x10 + 2x11 + 1x12
and then just check that 1 + 3 + 5 + 5 + 3 + 1 (the sum of the coefficients of the even powers of x) equals 2 + 4 + 6 + 4 + 2 (the sum of the coefficients of the odd powers of x). But the real strength of the more abstract approach lies in how well it scales. Because: The problem I really wanted to ask you involves rolling a six-sided die six times. (Rolling it just twice was only a warm-up.)
So, suppose you roll a six-sided die six times. The generating function for the sum of the six numbers that you roll is (x1 + x2 + x3 + x4 + x5 + x6)6, a polynomial with 31 terms that I would never dream of writing out by hand (or if I did dream about it I’d call it a nightmare). But if you just want to know whether the sum of the six numbers is likelier to be even or odd, you can just plug in x = −1, obtaining (−1+1−1+1−1+1)6, or 0. So once again, an even sum and an odd sum are equally likely.
Challenge problem: Suppose we have a (fair) five-sided die. (It’s easy to use a fair six-sided die to simulate a fair five-sided side: just roll it in the ordinary way, and if the outcome is a 6, look around furtively, announce “That didn’t just happen” and roll it again, and keep at it until you get a 1, 2, 3, 4, or 5.) If you roll your five-sided die five times, do you think the sum is likelier to be odd or even? What if you roll it six times? Generating functions give a quick route to the answer.10
Things get even more fun when more variables come into the game.11
We’ve played with taking powers of the generating function x1 + x2 + x3 + x4 + x5 + x6; now let’s go the other way and factor it.
Just as six jellybeans in a row can be divided into three groups of two or two groups of three, the terms in the sum
x1 + x2 + x3 + x4 + x5 + x6
can be grouped as
(x1 + x2) + (x3 + x4) + (x5 + x6)
(x1 + x2 + x3) + (x4 + x5 + x6).
The former grouping can be written as x0 (x1 + x2) + x2 (x1 + x2) + x4 (x1 + x2), which equals (x0 + x2 + x4) (x1 + x2), while the latter can analogously be written as x0 (x1 + x2 + x3) + x3 (x1 + x2 + x3) which equals (x0 + x3) (x1 + x2 + x3).
These factorizations give us curious ways to simulate a 6-sided die using a 2-sided die and a 3-sided die. (Multiplication of generating functions is the “mating behavior” I mentioned at the beginning of the essay.)
Let’s go back to the equation
(x0 + x2 + x4) (x1 + x2) = x1 + x2 + x3 + x4 + x5 + x6.
If we have a 3-sided die with faces showing 0 pips, 2 pips, and 4 pips (coming from the first factor on the left-hand of the equation) and a 2-sided die with faces showing 1 pip and 2 pips (coming from the second factor on the left-hand of the equation), and we roll them both and record the sum, the six equally likely outcomes are precisely the numbers 1 through 6.
Similarly, consider the equation
(x1 + x2 + x3) (x0 + x3) = x1 + x2 + x3 + x4 + x5 + x6.
It tells us that if we roll a 3-sided die with faces showing 1 pips, 2 pips, and 3 pips and a 2-sided die with faces showing 0 pips and 3 pips, and we roll them both and record the sum, the six equally likely outcomes are again the numbers 1 through 6.
This leads us to reinvent what are called Sicherman dice, devised in 1977 by puzzlemaker George Sicherman12 (though he did not invent them using generating functions). Remember that the generating function for the sum of two ordinary six-sided dice is (x1 + x2 + x3 + x4 + x5 + x6)2 . We can use our two factorizations to write this as
(x0 + x2 + x4) (x1 + x2) × (x1 + x2 + x3) (x0 + x3).
But, swapping factors, we see that this is equal to
(x0 + x2 + x4) (x0 + x3) × (x1 + x2 + x3) (x1 + x2)
or (moving a factor of x from the x-endowed factor x1 + x2 to the x-impoverished factor x0 + x2 + x4)
(x1 + x3 + x5) (x0 + x3) × (x1 + x2 + x3) (x0 + x1).
The first product (x1 + x3 + x5) (x0 + x3) expands as
x1 + x3 + x4 + x5 + x6 + x8
while the second product (x1 + x2 + x3) (x0 + x1) expands as
x1 + x2 + x2 + x3 + x3 + x4.
These two polynomials are the generating functions for Sicherman’s dice: the first has six sides bearing the numbers 1, 3, 4, 5, 6, and 8, while the second has six sides bearing the numbers 1, 2, 2, 3, 3, and 4. Despite the fact that the dice look strange, if you roll each once, the sum of the two numbers you roll is statistically indistinguishable from what you’d get from rolling two ordinary dice. And that’s because when you mate the generating functions of Sicherman’s dice, you’re getting the same polynomial factors that you get when you mate the generating functions of two ordinary dice — you just get them in a different order.13
Sicherman dice were introduced to the world at large through the Mathematical Games column of Martin Gardner, the writer who was hailed in his time as “the best friend mathematics ever had”. I grew up reading many of those columns in the 1970s, and if it hadn’t been for Gardner I would be a very different sort of mathematician, if I were a mathematician at all.
On a YouTube channel created in Gardner’s honor called “Celebration of Mind”, Alexandre Muñiz has a nice video about Sicherman dice and other things; in the video he shows how you can take two 2-sided dice and two 3-sided dice and (pairing them up one way) simulate two ordinary dice and (pairing them up another way) simulate the Sicherman dice.
Muñiz and I have come up with a curious kind of die that I hope someone will fabricate and send me. It’s a clear cube made of resin or acrylic with an opaque tetrahedral die embedded in it, with the four corners of the tetrahedron corresponding to four of the eight corners of the cube. (Or maybe the surrounding cube exists only as a network of twelve struts; we still haven’t decided what physical instantiation works best.) In any case, the tetrahedron inside the cube has four faces with 0, 1, 2, and 4 pips respectively. You can check that the number of pips visible from above the die after you roll it is 1, 2, 3, 4, 5, or 6, just as with an ordinary die, but the total number of pips is only 7 instead of 28.
The tetrahedron was my idea; I thought one would roll it on a creased surface (such as the inside of an open book) so that it always lands on an edge, as described in a recent Riddler Express puzzle at FiveThirtyEight.com. Michael Branicky suggested using a taco holder while Zach Wissner-Gross preferred a ridged gnocchi board. Muñiz had the fantastic idea of embedding the tetrahedron in a cube. I can see it in my head, but I’d like to hold one in my hand!
[Postscript: Reader Dave LeCompte has fabricated such a die! A photo appears below.
The photo includes a penny for scale.]
If you find the math behind Sicherman dice fun, you might ask the question, for what values of n can you design two non-standard n-sided dice with the property that, if you roll both dice and record their sum, the outcome is statistically indistinguishable from what you’d get from rolling two standard n-sided dice? (Here the standard n-sided die has numbers 1 through n on its faces.) We’ve seen that you can do it for n = 6; what other values are possible? Post your ideas in the Comments.
Incidentally, the Romans used two kind of dice: a small six-sided die called a tessera whose sides were marked with the numbers from 1 to 6 (that is, essentially a modern cubical die) and a larger four-sided die called a talus whose sides were marked with the numbers 1, 3, 4, and 6. Do you see a way to “Sicherman-ize” the talus die? That is, do you see how to design two four-sided dice, different from the talus and from each other, with the property that if you roll them together, the distribution of the sum is the same as what you would see if you rolled two talus dice? Post your answer in the Comments.
Not related to dice but definitely related to generating functions is the 2Blue1Brown video “Olympiad-level counting” I recommended last month. If you haven’t watched it yet, now’s the time!
I’ll close with a famous puzzle about cheating at dice; it can be solved with generating functions, though it is a bit more advanced than the other puzzles. The question is, can you design two crooked 6-sided dice with the property that their sum is equally likely to be any of the eleven numbers 2, 3, 4, …, 10, 11, and 12?
By a crooked die, I mean a die in which the six outcomes don’t have an equal chance of occurring. In practice, a really lopsided weighting would be (a) hard to achieve and (b) easy to detect, but since you’re reading this as a math-fan and not as a gambler, we’ll model a crooked die as being determined by any six nonnegative numbers a1, a2, … , a6 that add up to 1, and imagine that those are supposed to be the probabilities of the die landing with the six respective faces facing up.
I want two crooked dice, one associated with the probabilities a1, a2, … , a6 and the other associated with the probabilities b1, b2, … , b6, so that when I roll both dice, the sum of the numbers shown is equally likely to take on each of the eleven possible values between 2 and 12.
To get you started on the problem, I’ll claim (without proof) that the property we’re trying to achieve can be restated in terms of the generating functions
A(x) = a1 x1 + a2 x2 + a3 x3 + a4 x4 + a5 x5 + a6 x6
B(x) = b1 x1 + b2 x2 + b3 x3 + b4 x4 + b5 x5 + b6 x6 .
Specifically, we want to choose numbers a1, a2, … , a6 and b1, b2, … , b6 so that A(x) times B(x) equals (1/11) x2 + (1/11) x3 + … + (1/11) x11 + (1/11) x12. Can you find a way to do it or show that it can’t be done? Are there two generating functions that we can mate to create such an offspring?14
Thanks to Sandi Gubin, Alexandre Muñiz, Bill Ossmann, George Sicherman, James Tanton, and Dan Ullman.
Gary Antonick, “Col. George Sicherman’s Dice”, https://archive.nytimes.com/wordplay.blogs.nytimes.com/2014/06/16/dice-3/
Martin Gardner, “Sicherman Dice, the Kruskal Count and Other Curiosities”, chapter 19 in Penrose Tiles to Trapdoor Ciphers … and the Return of Dr. Matrix.
Alexandre Muñiz, “How to Roll Two Dice”, https://www.youtube.com/watch?v=-aDfFh5YUD8
George Sicherman, “Sicherman Dice”, https://userpages.monmouth.com/∼colonel/sdice.html
[Note: Some of you may have tried to access specific endnotes by clicking on the associated footnote numbers in the main body of the tex, and then been frustrated that this didn’t work. I’m frustrated too! I used to have a way to do this but it doesn’t work in the current version of WordPress. If any of you know a good way to create navigable internal links using the current WordPress implementation of hypertext, please let me know. It’s the year 20-friggin’-22; you shouldn’t have to scroll forward and then scroll back to read the endnotes!]
#1. Others like Pierre Fermat played a role in inventing what are now called Cartesian coordinates, but I don’t want to go down that interesting side-track today.
#2. Technically, Descartes would have written ax2 as axx, reserving exponential notation for powers higher than the 2nd.
#3. How does the mathematical engine lurking behind the website do its magic? You can find out from Tanton’s videos, available through links underneath the Personal Polynomial screen. I don’t know the details of this particular computer program, but I know one way the trick can be done, via the time-honored tactic of breaking a problem into pieces. If we can find three polynomials – call them p1(x), p2(x), and p3(x) – satisfying
p1(1) = 10, p1(2) = 0, and p1(3) = 0,
p2(1) = 0, p2(2) = 9, and p2(3) = 0, and
p3(1) = 0, p3(2) = 0, and p3(3) = 13,
then we can add them together to get a polynomial satisfying
p(1) = 10, p(2) = 9, and p(3) = 13.
If we define q1(x) = (x − 2) (x − 3), then the polynomial q1(x) almost does the job that p1(x) is supposed to do; it satisfies two of the three conditions, specifically, q1(2) = 0 and q1(3) = 0. Before you check this, let me warn you that if you check it by expanding q1(x) as x2 − 5x + 6 and then substituting x = 2 and x = 3, Descartes will turn over in his grave. The right way to see that q1(2) is 0 is to plug x = 2 directly into the product (x − 2) (x − 3); then the first factor is 2 − 2, or 0, so the product must be 0, regardless of what the second factor is. Likewise, the right way to see that q1(3) = 0 is to plug x = 3 directly into the product (x − 2) (x − 3). Unfortunately, q1(1) isn’t 10; it’s (1 − 2) (1 − 3) = 2. But all you have to do to fix that blemish is multiply q1 by 5. If we put p1(x) = 5 q1(x) = 5 (x − 2)(x − 3), we get p1(1) = 10, p1(2) = 0, and p1(3) = 0, just as we wanted. So we’ve found p1(x).
Likewise, we can get p2(x) by starting with q2(x) = (x − 1) (x − 3) and multiplying by a suitable fudge factor (namely −9) to get p2(x) = −9 (x − 1) (x − 3) satisfying p2(1) = 0, p2(2) = 9, and p2(3) = 0. Lastly, p3(x) = (13/2) (x − 1) (x − 2) satisfies p3(1) = 0, p3(2) = 0, and p3(3) = 13. Putting it all together, we form
p(x) = p1(x) + p2(x) + p3(x) = 5 (x − 2) (x − 3) − 9 (x − 1) (x − 3) + (13/2) (x − 1) (x − 2),
which (after you expand and recombine the terms) becomes (5/2)x2 − (17/2)x + 16. This is the famous method of Lagrange interpolation.
#4. What the forecasters did was not quite as dumb as fitting a third-degree polynomial to just four data points. Rather, they took a whole lot of data points and found the third-degree polynomial that fits the data as closely as possible. That’s less dumb, but when you’re crafting national policy in the face of a global health emergency, “less dumb” doesn’t cut it.
#5. You could solve the equation (5/2)x2 − (17/2)x + 16 = 10 by rewriting it as (5/2)x2 − (17/2)x + 6 = 0 and using the quadratic formula, but I prefer to use factoring. Since the left hand side of the equation equals 0 when x = 1, the Factor Theorem tells us (5/2)x2 − (17/2)x + 6 must factor as x − 1 times some linear polynomial ax + b. That is, we should be able to find constants a and b so that (x − 1) (ax + b) expands to give (5/2)x2 − (17/2)x + 6. (Notice how we’ve flipped Descartes’ script: a and b are the unknowns, not x!) Expanding (x − 1) (ax + b) gives ax2 + (b − a) x − b, which is equivalent to (5/2)x2 − (17/2)x + 6 provided that a and b satisfy the three equations a = 5/2, b − a = −17/2, and −b = 6. We can solve the first and last equations to get a = 5/2 and b = −6 and then check that these values satisfy the middle equation as well. So (5/2)x2 − (17/2)x + 6 factors as (x − 1) ((5/2)x − 6), telling us that the second root satisfies (5/2)x − 6 = 0, or (5/2) x = 6, or x = 6 / (5/2) = 12/5.
#6. Wilf was one of the tallest mathematicians I ever met, and his book had one of the shortest titles of any math book I ever read as measured by word-count, but that’s only because he cheated: his book is called generatingfunctionology instead of something more boring like “Theory and applications of generating functions”. You’ll notice that this month I’m only talking about polynomials, but Wilf’s book also talks about power series, which is the topic of next month’s essay. I should mention that the kind of generating functions Wilf focuses on in his book encode (ordered) sequences, whereas the kind I’m writiing about here encode (unordered) sets. Incidentally, Wilf’s name, profession, and stature provided the inspiration for the minor Sesame Street character Herb Wolf.
#7. Wilhelm Gottfried Leibniz, co-inventor of the calculus, asserted in his “Opera Omnia” that when you roll two dice, you’re just as likely to roll a 12 as you are to roll an 11, on the grounds that each outcome can be achieved in exactly one way: the former as 6+6, the latter as 5+6. From this we learn two things: first, that even great mathematicians make mistakes, and second, that Leibniz didn’t spend much time gambling. If Leibniz had spent some of his youth in gambling dens, he would’ve learned (possibly by losing his puffy shirt a few times) that an 11 comes up a lot more often than a 12, and if he’d read the writings of Cardano, Pascal, and Fermat he would have tabulated the 36 equally likely outcomes of rolling two 6-sided dice and he would’ve been able to check that an 11 is exactly twice as likely as a 12. Part of the conceptual difficulty here is that when you roll two dice at the same time, as opposed to rolling a single die twice, it’s harder to see why an 11 corresponds to two separate outcomes. If one die is red and the other is blue, then we can distinguish red-5-and-blue-6 from red-6-and-blue-5, but if the dice are hard to tell apart, then there may not be an easy way for us to tell apart the two outcomes; and if we can’t tell the difference, it’s hard to believe that Tyche, the goddess of chance, should care. But she does!
#8. Yes, I know it’s the name of a Laurie Anderson song.
#9. There are other ways to understand this fact. For instance, take the six-by-six addition table and color entries black if the sum is even and white if the sum is odd. Since each row has three white entries and three black entries, there are equally many white entries as black entries in the table as a whole. Or: notice that we can divide the table into nine 2-by-2 blocks, each of which contains equal numbers of black and white squares.
#10. The generating function for a single roll is x1 + x2 + x3 + x4 + x5, so the generating function for the sum of five rolls is (x1 + x2 + x3 + x4 + x5)5. Plugging in x = −1, we get (−1 + 1 − 1 + 1 − 1)5 = (−1)5 = −1, so the negative coefficients in the expansion collectively overpower (just barely) the positive coefficients; the sum of the five numbers we roll is ever-so-slightly likelier to be odd than even. On the other hand, (−1+1−1+1−1)6 = (−1)6 = +1, so on the sixth roll the balance of power shifts; the sum of six numbers is ever-so-slightly likelier to be even than odd.
#11. So far we’ve looked at generating functions with a single variable x (usually called an indeterminate rather than a variable in this context). But generating functions needn’t be limited to a single indeterminate. Consider for instance the expression (x + y)2, which expands as x2 + 2xy + y2. Writing this as 1x2 + 2xy + 1y2, we find that this polynomial is the generating function for the sequence 1, 2, 1. Likewise (x + y)3 is the generating function for the sequence 1, 3, 3, 1; (x + y)4 is the generating function for the sequence 1, 4, 6, 4, 1; and so on. These sequences are the rows of the famous triangle of binomial coefficients attributed variously to Pingala (India), Yang Hui (China), Omar Khayyam (Iran), Tartaglia (Italy), and Pascal (France), that starts like this:
It’s a curious fact that if you alternately add and subtract the elements in any row of the triangle other than the top row, you end up with zero. This isn’t so surprising when the second entry in a row is an odd number like 3 or 5, because then positive terms and negative terms cancel in an obvious way (as in 1 − 3 + 3 − 1). But it’s less obvious why we should have 1 − 4 + 6 − 4 + 1 = 0 and 1 − 6 + 15 − 20 + 15 − 6 + 1 = 0 and so on. But generating functions once again can help us. Consider (x + y)4 =1x4 + 4x3y + 6x2y2 + 4xy3 + 1y4. Replacing y by −y, we get
(x − y)4 =1x4 − 4x3y + 6x2y2 − 4xy3 + 1y4.
If we now set x = y = 1, the left hand side of the inset equation becomes 04, or 0, while the right hand side becomes 1 − 4 + 6 − 4 + 1, the desired alternating sum. The same trick works for (x − y)n for any positive integer n. (Note, though, that the alternating sum of the entries in the top row of the triangle is 1, not 0. This accords with the fact that there are many mathematical situations, especially in discrete mathematics, where it makes more sense to define 00 to equal 1 rather than 0.) This can be used to show that if you toss a coin one or more times, the probability that the number of heads is even is exactly 1/2, as is the probability that the number of heads is odd.
#12. Sicherman took to styling himself as The Colonel as a joke and many people have referred to him in print as such, mistaking the nickname for a military title.
#13. The polynomial x1 + x2 + x3 + x4 + x5 + x6 is actually a product of four irreducible polynomials (polynomials that cannot be factored further): (x) (1 + x) (1 + x + x2) (1 − x + x2). Combining the x, 1 + x + x2, and 1 − x + x2 gives us the x1 + x3 + x5; pairing up the 1 + x and 1 − x + x2 gives us the x0 + x3, and pairing up the x and 1 + x + x2 gives us the x1 + x2 + x3. It’s a theorem of advanced algebra that factorization of polynomials into irreducibles, like factorization of integers into primes, can be done in only one way. The polynomials 1 + x, 1 + x + x2, and 1 − x + x2 are examples of cyclotomic polynomials; specifically, they are Φ2(x), Φ3(x), and Φ6(x), where Φn(x) is the polynomial whose roots are precisely the primitive nth roots of 1 – that is, the complex numbers z that satisfy zn = 1 but don’t satisfy zd = 1 for any positive integer d < n.
#14. Since A(x) and B(x) are both divisible by x, it’s handy to pull out those factors of x and focus on the 5th degree polynomials A*(x) = A(x)/x and B*(x) = B(x)/x, and to write the equation as A*(x) B*(x) = (1/11) (1 + x + x2 + ··· + x10). A*(x) is a 5th degree polynomial, and every polynomial of odd degree has a real root, so there exists a real number r such that A*(r) = 0, implying that A*(r) B*(r) = 0. If we had A*(x) B*(x) equal to (1/11) (1 + x + x2 + ··· + x10), then, plugging in x = r, we’d get (1/11) (1 + r + r2 + ··· + r10) = 0. Multiplying the equation by 11 and then by 1 − r, we get (after lots of cancellation) 1 − r11 = 0. So r must satisfy 1 − r11 = 0, and the only real number r with that property is r = 1. Since we assumed A*(r) = 0, we must have A*(1) = 0. But A*(1) isn’t 0; in fact it’s a1 + a2 + ··· + a6, which is 1, not 0.
This proof won’t work for n-sided dice when n is odd, but in this case complex numbers can come to our rescue. I’ll take n = 5 for definiteness. We are looking for polynomials A(x) = a1 x1 + a2 x2 + a3 x3 + a4 x4 + a5 x5 and B(x) = b1 x1 + b2 x2 + b3 x3 + b4 x4 + b5 x5 satisfying A(x) B(x) = (1/9) (x2 + x3 + … + x10). As above, switch to A*(x) = a1 x0 + a2 x1 + … + a5 x4 and B*(x) = b1 x0 + b2 x1 + … + b5 x4 satisfying A*(x) B*(x) = (1/9) (x0 + x1 + … + x8). Let ζ be the 9th root of unity exp(2πi/9), so that (1/9) (ζ0 + ζ1 + · · · + ζ8) vanishes. A*(ζ) can’t vanish because it’s a convex combination of the complex numbers ζ0, ζ1, . . . , and ζ4, all of which lie in the upper half-plane. Hence, no dice!