When Five Isn’t Prime

How far would you go to save a theorem? Would you invent a new kind of number? That’s what the mid-19th century German mathematician Ernst Eduard Kummer did, and while he was partly driven by the hope of proving Fermat’s Last Theorem, that wasn’t actually the theorem he was trying to save.

Most readers of this essay will already be familiar with Fermat’s Last Theorem (“FLT” for short), but everyone has a first time learning about FLT, and this essay may be yours, so I’ll remind/inform you that Fermat’s Last Theorem is the infamous assertion that if n is some positive integer bigger than 2, then the relation a^n + b^n = c^n can’t have a solution in which a, b, and c are non-zero integers.1 Fermat probably proved the claim for n = 4, and later mathematicians proved it for other specific values of the exponent n (focussing on prime exponents, since if you can rule out all the primes bigger than 2 as possible exponents, the general assertion will follow2). FLT is indeed a theorem now (thanks to Andrew Wiles, Richard Taylor, Ken Ribet and many others) but it technically shouldn’t be called Fermat’s theorem because it’s unlikely that Fermat proved it (even though he claimed he’d found a proof and a marvelous one at that); see my essay The Curious Incident of the Boasting Frenchman.

What’s indisputably true is that in March of 1847, roughly two centuries after Fermat made his boast, the French mathematician Gabriel Lamé underwent the mathematician’s equivalent of finding oneself suddenly naked in public: he sketched a proposed proof of Fermat’s claim to the Paris Academy only to have his idea publicly shot down minutes later by a colleague he admired. The colleague was Joseph Liouville, whose work Lamé had acknowledged in his presentation as a source of inspiration. As if that wasn’t enough of a blow, Liouville suggested that Lamé’s approach wasn’t just wrong; it was also a rather obvious thing to try. Ouch! Liouville may have been unkind but he was right; Leonhard Euler had made his own version of Lamé’s mistake a century earlier in a failed proof of the special case n = 3 of FLT.3

OLD AND NEW MISTAKES

Lamé had the idea that one could approach the equation a^n + b^n = c^n by factoring both sides. The right side of the equation already comes to us in the form of a product (c times c times … times c), but the left side is trickier. Wouldn’t it be nice if the left side, like the right side, could be written as n factors multiplied together? Good news: you can do it if you allow complex numbers! If you let ζ_1, ζ_2, …, ζ_n be the n complex nth roots of 1, where n is odd, then you can rewrite a^n + b^n as (a + ζ_1 b) (a + ζ_2 b) ··· (a + ζ_n b).4 Numbers expressible in terms of ζ_1, ζ_2, …, ζ_n using addition, subtraction, and multiplication are called cyclotomic numbers.5
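For readers who like to check such claims numerically, here is a minimal Python sketch of the factorization. The function name cyclotomic_product and the sample values a = 3, b = 4, n = 5 are illustrative choices, not anything taken from Lamé. The roots of 1 are computed in floating point, so the two printed values should agree only up to rounding error.

```python
import cmath

def cyclotomic_product(a, b, n):
    """Multiply out (a + zeta*b) over all n complex nth roots zeta of 1."""
    product = 1
    for k in range(n):
        # The k-th complex nth root of 1, as a point on the unit circle.
        zeta = cmath.exp(2j * cmath.pi * k / n)
        product *= a + zeta * b
    return product

a, b, n = 3, 4, 5                      # illustrative values; n must be odd
print(cyclotomic_product(a, b, n))     # a complex number with a tiny imaginary part
print(a**n + b**n)                     # the exact integer it should approximate
```

Trying an even exponent such as n = 4 should give a^n − b^n rather than a^n + b^n, which is one way to see why the assumption that n is odd matters.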

Lamé exploited a sort of tension between the two factorizations, and proved (or rather thought he’d proved) that the only way to resolve that tension when a, b, and c are integers is to have one of the three integers equal 0, vindicating Fermat’s claim. What do I mean by “tension”? Consider the fact that 36 can be written both as 4 times 9 and as 6 times 6. If 4, 6, and 9 were primes, the equation 4×9 = 6×6 would cause us consternation, because the Fundamental Theorem of Arithmetic (taken for granted by Euclid over two thousand years ago, stated in the 14th century by Kamal al-Din al-Farisi, and rigorously proved in the 19th century by Carl Friedrich Gauss) says that if we ignore the order in which factors are written, every positive integer can be written as a product of primes in just one way.6 Fortunately, none of the numbers 4, 6, 9 are prime; each can itself be written as a product. 4 is 2×2, 6 is 2×3, and 9 is 3×3. The unique way of writing 36 as a product of primes (unique if we ignore the order in which the factors appear) is 2×2×3×3. If we group the factors as (2×2)×(3×3) we get 4×9; if we re-order the factors and group them as (2×3)×(2×3), we get 6×6. The primes 2 and 3 were present all along when we wrote 36 as 4×9 and as 6×6; they were just hiding under the surface.

Lamé’s argument hinged on writing numbers as products of primes, except that the numbers he was factoring were cyclotomic numbers so the primes he needed would have to be cyclotomic numbers too. Gauss had done some work along these lines. He’d studied the arithmetic of what are now called Gaussian integers: complex numbers of the form a + bi where a and b are ordinary integers and i is sqrt(–1). In this arithmetic, 5 can be written as (2 + i) × (2 – i), so in the world of Gaussian integers, the number 5, which we all learned in school is prime, suddenly becomes composite; 3, on the other hand, stubbornly refuses to factor. So factoring numbers in the Gaussian integers is sort of a bizarro version of factoring in the ordinary integers: some of the primes stay prime, but others split into smaller pieces. Gauss didn’t think it was obvious that his Fundamental Theorem of Arithmetic would still apply to Gaussian integers, so he took the trouble of proving that it did. Here’s a picture of (a finite excerpt of) the Gaussian integers. You can see it as the set of vertices in a tiling of the plane by 1-by-1 squares.

The number system we call the Gaussian integers isn’t the only bizarro version of the system of ordinary integers; there are infinitely many. For instance, instead of numbers of the form a + bi, we could look at numbers of the form a + bω where a and b are ordinary integers and ω is (–1 + sqrt(–3))/2. Numbers of this form are called Eisenstein integers. In the Eisenstein integers, 3 is composite (it factors as (2 + ω) × (2 + ω^2)) but 5 is prime (or rather, as mathematicians prefer to say nowadays, 5 is irreducible). Here’s a picture of the Eisenstein integers. You can see it as the set of vertices in a tiling of the plane by equilateral triangles of side-length 1.
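The claims about 3 and 5 in these two number systems can be checked with a short computation. The sketch below leans on a standard tool the essay doesn’t introduce, the norm: a² + b² for the Gaussian integer a + bi, and a² − ab + b² for the Eisenstein integer a + bω, taking ω = (−1 + sqrt(−3))/2 so that ω^3 = 1. In both of these rings, an ordinary prime p breaks into smaller pieces exactly when some element has norm p; that fact is assumed here rather than proved, and the function names are made up for illustration.

```python
from math import isqrt

def gaussian_norm(a, b):
    """Norm of the Gaussian integer a + bi."""
    return a * a + b * b

def eisenstein_norm(a, b):
    """Norm of the Eisenstein integer a + b*omega, omega = (-1 + sqrt(-3))/2."""
    return a * a - a * b + b * b

def has_element_of_norm(p, norm):
    """Search a finite box of (a, b) pairs for an element whose norm is p.

    Both norms are at least (a*a + b*b)/2, so |a| and |b| never need
    to exceed the integer square root of 2*p.
    """
    bound = isqrt(2 * p)
    return any(norm(a, b) == p
               for a in range(-bound, bound + 1)
               for b in range(-bound, bound + 1))

for p in (3, 5):
    print(p, "splits in the Gaussian integers:", has_element_of_norm(p, gaussian_norm))
    print(p, "splits in the Eisenstein integers:", has_element_of_norm(p, eisenstein_norm))
```

The search finds 2 + i (norm 5) and 2 + ω (norm 3) but no Gaussian integer of norm 3 and no Eisenstein integer of norm 5, matching the behavior described above.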

The study of number systems like the Gaussian integers and the Eisenstein integers and their kin is called algebraic number theory. Algebraic number theorists study number systems that nowadays are called number rings (though this term didn’t come into currency until well after Lamé and Liouville’s time). Number rings are sets of complex numbers that contain all the ordinary integers and are closed under addition, subtraction and multiplication. The ordinary integers form a number ring, the Gaussian integers form a number ring, the Eisenstein integers form a number ring, and so on. These last two are examples of the cyclotomic number rings that I mentioned earlier and whose central feature I’ll remind you of: they are number rings that contain all the complex nth roots of 1 for some particular n, along with all the complex numbers you can build from them using addition, subtraction, and multiplication (but no other numbers). The Gaussian integers form the cyclotomic number ring for n = 4 (since i^4 = 1) and the Eisenstein integers form the cyclotomic number ring for n = 3 (since ω^3 = 1). These are the right kind of number systems to use if we want to have a chance of making Lamé’s approach work; they have enough numbers in them that an expression like a^n + b^n can be written as a product of n factors, but not so many numbers that the whole concept of being irreducible breaks down.7

Lamé didn’t make the mistake of letting too many numbers into his party; his mistake lay in trusting that the numbers he’d invited would behave themselves – by which I mean, trusting that they’d all act the way the number 36 acts in the ordinary integers. We saw that two different ways of factoring 36 could be reconciled via the factorization of 36 into primes. Lamé assumed that this kind of reconciliation would always apply in all number rings. He was wrong, and there’s a simple counterexample situated in the number ring consisting of all numbers of the form a + b sqrt(–5) where a and b are integers.8 In this number ring (shown below) the numbers 3, 2+sqrt(–5), and 2–sqrt(–5) are all irreducible, yet we have

(2+sqrt(–5)) × (2–sqrt(–5)) = (3) × (3).
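Both halves of this claim (the two products agree, and the factors can’t be broken down further) can be sketched in a few lines of Python. Elements a + b sqrt(−5) are stored as pairs (a, b); the helper names are hypothetical, and the irreducibility argument relies on the norm a² + 5b², a standard tool that the essay itself doesn’t use. Since the norm is multiplicative and only ±1 have norm 1, an element of norm 9 could only factor into two pieces of norm 3, so it suffices to check that nothing has norm 3.

```python
from math import isqrt

def mul(x, y):
    """Multiply a + b*sqrt(-5) by c + d*sqrt(-5), both given as integer pairs."""
    a, b = x
    c, d = y
    return (a * c - 5 * b * d, a * d + b * c)

def norm(x):
    """Norm of a + b*sqrt(-5); it multiplies when elements are multiplied."""
    a, b = x
    return a * a + 5 * b * b

def elements_of_norm(n):
    """All pairs (a, b) with a*a + 5*b*b == n (a finite search suffices)."""
    bound = isqrt(n)
    return [(a, b)
            for a in range(-bound, bound + 1)
            for b in range(-bound, bound + 1)
            if norm((a, b)) == n]

print(mul((2, 1), (2, -1)))   # (2 + sqrt(-5)) * (2 - sqrt(-5))
print(mul((3, 0), (3, 0)))    # 3 * 3
print(elements_of_norm(3))    # an empty list means 3 and 2 ± sqrt(-5) are irreducible
```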

What to do?

Unbeknownst to Lamé and Liouville, a German mathematician had found the answer three years earlier.

BREAKING THINGS DOWN

If you’d been a teacher at the Liegnitz Gymnasium (in modern terms, a preparatory high school) in the late 1830s, you might have predicted that Ernst Eduard Kummer, a popular teacher of math and physics who’d completed a Ph.D. before embarking on his teaching career, was destined to do great things outside as well as inside the classroom. Not only did he manage to find time for research but he also mentored some of his teenaged students in conducting research even before they entered university. (One of these students was a young fellow named Leopold Kronecker; remember that name.)

Kummer’s work came to the attention of the two great mathematical Gustavs of the 19th century: Carl Gustav Jacob Jacobi and Johann Peter Gustav Lejeune Dirichlet.9 Jacobi set about getting Kummer a university position, while Dirichlet arranged to get Kummer elected to the Berlin Academy. Dirichlet also played a role in Kummer’s personal life, introducing the high school teacher to the woman he would marry in 1840 (a cousin of Dirichlet’s wife).

In 1842, Kummer was appointed to a full professorship at the University of Breslau (now Wroclaw in Poland). A few years later, he got interested in an 1839 article by Jacobi on higher reciprocity laws in number theory (a fascinating and important topic that I’ll treat as a black box) and began to do research in this area. In these reciprocity laws, a crucial role is played by the cyclotomic number rings we met earlier, the rings that Lamé would use a few years later in his hapless attempt to prove FLT.

Kummer noticed that while these number rings admitted a notion of primality (everything can be broken down into irreducible elements), some of the rings featured numbers that could be broken down into irreducible elements in more than one way. It seemed that Gauss’ Fundamental Theorem of Arithmetic, valid for the ordinary integers and the Gaussian integers (as well as the Eisenstein integers and other scattered number-emporia in the far-flung algebraic number theory franchise), failed to work if you tried to push it too far. Some mathematicians might have shrugged and said “I guess that’s just the way things are.” But Kummer couldn’t let go of trying to extend the Fundamental Theorem of Arithmetic to those resistant number rings. The Theorem worked so beautifully for ordinary integers and it had so many important consequences; he felt there had to be a way to rescue it. He wondered if the irreducible elements in the unreconciled factorizations could actually be broken down further if one included new sorts of entities that he called “ideal complex numbers”.10

Kummer found an approach that worked. Even though he couldn’t say what his ideal complex numbers “were”, he could give rules for how to multiply them and how to relate them to ordinary complex numbers, and with their aid he was able to establish that in many complex number rings (including many of the ones that Lamé was interested in), every element of the ring was associated with an ideal complex number, and this ideal complex number factored in a unique way into primes – though these “primes” were no longer true numbers but rather ideal numbers.

Let’s go back to our (2 + sqrt(−5)) × (2 − sqrt(−5)) = 3 × 3 example. Kummer’s theory led him to write 2 + sqrt(−5) as the square11 of a certain ideal divisor, which I’ll call 𝖆 (it’s traditional to use a Gothic font in this context). Likewise, Kummer wrote 2 − sqrt(−5) as the square of a different ideal divisor, which I’ll call 𝖇. Lo and behold, in Kummer’s theory 𝖆 × 𝖇 equals 3, so the equation (2 + sqrt(−5)) × (2 − sqrt(−5)) = 3 × 3 that formerly seemed to defy the unique factorization property could now be reconciled as (𝖆^2) (𝖇^2) = (𝖆𝖇)^2. The Fundamental Theorem of Arithmetic could be rescued if instead of factoring numbers one factored the associated ideals.

I’m tempted to compare Kummer’s postulation of ideal divisors to physicist Murray Gell-Mann’s postulation of the existence of quarks. Just as quarks never appear in isolation but only in combination with other quarks, the ideals 𝖆 and 𝖇 can only be “observed” indirectly when they combine to give good old-fashioned complex numbers like 2 + sqrt(−5).

Kummer’s work wasn’t perfect – it contained at least one major mistake (see the Edwards article listed in the References) – but the overall approach was sound, and he published a paper on it in 1844. It didn’t attract much attention (the journal he’d chosen was an obscure one), so in 1846, he published a notice in the proceedings of the Berlin Academy, announcing that he had devised an entirely new theory of ideal complex numbers for the purpose of saving the unique factorization theorem of Gauss in this new context and calling his colleagues’ attention to the 1844 paper. However, neither his 1844 paper nor his 1846 notice made any mention of Fermat’s Last Theorem.

Then, in 1847, the whole Lamé debacle occurred. Liouville and the esteemed mathematician Augustin-Louis Cauchy hoped to make some progress using Lamé’s basic approach, but non-uniqueness of factorization was a road-block. Lamé, Liouville, and Cauchy weren’t familiar with Kummer’s work, but Kummer’s cousin-in-law Dirichlet, whose own mathematical debut before the Paris Academy back in 1825 had involved Fermat’s Last Theorem, knew about Kummer’s paper. Dirichlet contacted Kummer to let him know that the mathematicians of Paris were struggling with matters that Kummer had already mastered. Later that spring, Kummer sent Liouville a copy of the 1844 paper that described his theory of ideal complex numbers, with a cover letter that made it clear that Kummer had already figured out everything Lamé had, and more; not only had Kummer seen the obstacle that Liouville had pointed out, he had found a way to push past it, at least for some values of the exponent n. He wrote: “The applications of this theory to the proof of Fermat’s theorem have occupied me for a long time, and I have succeeded in making the impossibility of the equation12 x^n − y^n = z^n depend on two properties of the prime number n, so that all that remains is to find whether they apply to all the prime numbers.” His cover letter also suggested that Liouville have a look at the doctoral dissertation of Kummer’s disciple Kronecker.

In 1850, the Paris Academy offered a prize for the proof of Fermat’s Last Theorem. No proof was forthcoming. In the meantime, Kummer had gone on to show that all the odd primes below 37 satisfied his two conditions, but that 37, alas, did not. Nowadays we say that primes satisfying Kummer’s two conditions are “regular” primes. Kummer’s work showed that if the odd prime p is regular, then Fermat’s Last Theorem is true for the exponent p. It’s still not known whether there are infinitely many regular primes, but there are both empirical and theoretical reasons to believe that over 60 percent of the primes are regular (see the Keith Conrad essay in the References). So you could say that Kummer more than half-solved the problem that had so vexed the mathematical community.

Although Kummer never submitted his work to the Paris Academy, in 1857 the Academy gave him the prize anyway. Even if his true goal had been to extend Gauss’ theorem, not to prove Fermat’s claim, Kummer’s was the most successful assault on FLT that had been launched since Fermat came up with the problem two centuries earlier.

“WHAT ARE YOU TALKING ABOUT?”

Richard Dedekind, twenty years younger than Kummer, found Kummer’s approach to ideal divisors frustrating. Kummer had given the world a way to work with ideal divisors, and Kronecker had pushed Kummer’s approach farther, but neither Kummer nor Kronecker had said what ideal divisors were. This frustrated Dedekind; he thought that Kummer and Kronecker’s indirect way of reasoning about ideal divisors could lead the insufficiently cautious to “hasty conclusions and incomplete proofs”. As he wrote in 1871, imagining what the Kummer/Kronecker theory of divisors might look like if perfected and completed by future researchers, “Even if there were such a theory, based on calculation, it still would not be of the highest degree of perfection, in my opinion. It is preferable, as in the modern theory of functions, to seek proofs based immediately on fundamental characteristics, rather than on calculation, and indeed to construct the theory in such a way that it is able to predict the results of calculations.”

If we step back from the details of Dedekind’s way of concretizing ideal divisors and consider it from a philosophical standpoint, the approach he took is strikingly reminiscent of his approach to making sense of real numbers. Since I didn’t explain Dedekind’s approach when I wrote my essay “Things, Names, and Numbers” (choosing to describe Cantor’s construction instead), let me summarize Dedekind’s construction here, before I say how Dedekind rephrased Kummer and Kronecker’s ideas and put them in their currently accepted modern form.

Dedekind noticed that each irrational number can be precisely located on the number line by specifying which rational numbers are to its left and which are to its right (because if two irrational numbers are unequal there must be a rational number between them). So, as stand-ins for the irrational numbers, Dedekind used “cuts”, where he defined a cut as a way of cleaving the rational numbers into a left part and a right part, with no overlap between the two parts and with no rational number omitted, such that the left part has no greatest element and the right part has no smallest element. Each such cut singles out a specific hole in the rational number line – a place where we want there to be an irrational number, or something that “quacks” like one. Having defined his cuts, Dedekind had to teach them to quack; that is, he had to define a way to add and multiply cuts, and he had to figure out how to bring rational numbers into his manufactured cut-arithmetic. But once he’d done it, he could say he had constructed the arithmetic of real numbers from the arithmetic of rational numbers.

In short, Dedekind replaced the question “What is an irrational number?” by the question “How can we tell two irrational numbers apart?”, answered it by saying “You can find a rational number that’s less than one of them but not the other,” and then went on to show that all the properties of a specific irrational number could be deduced from just knowing which rational numbers were less than it.

In a similar way, Dedekind replaced “What is a divisor?” by “What distinguishes one divisor from another?” His answer was, the ordinary (non-ideal) numbers it divides. In Kummer and Kronecker’s theory, if 𝖆 and 𝖇 are distinct divisors, there is some complex number that is divisible by 𝖆 but not 𝖇 or vice versa. So Dedekind said “Let’s replace the divisor 𝖆 by the collection of all the complex numbers it divides (inside the number ring under consideration), since that collection contains all the information we need in order to determine which divisor we’re talking about.” And as in his work on cuts, all the properties of 𝖆 could be deduced from just that bare-bones information.

For instance, let’s reconsider the specific ideal divisor 𝖆 we encountered earlier in connection with (2 + sqrt(−5)) × (2 − sqrt(−5)) = 3 × 3. If we plot all the numbers of the form a + b sqrt(−5) that 𝖆 divides in Kummer and Kronecker’s theory, we get the funky picture shown below. (Look at it in a mirror and you get all the numbers of the form a + b sqrt(−5) that 𝖇 divides.)

Dedekind invited algebraic number-theorists to use such collections of complex numbers as concrete stand-ins for the abstract divisors in Kummer’s theory. He called these collections ideals, and in a way the shift from divisors to ideals was more momentous than his invention of Dedekind cuts. Cuts were a single-use contrivance designed to give mathematicians assurance that the theory of real numbers wasn’t built on sand; real numbers could be built on a foundation of rational numbers. But Dedekind never intended that the vocabulary of cuts would change the way the working mathematician would think about, or talk about, irrational numbers; the cuts-concept was an artifice, mere scaffolding to be discarded after the edifice of the real number system had been constructed. In contrast, Dedekind intended his ideals as a workable alternative to Kummer and Kronecker’s divisors.

To see what’s going on with these pictures and the collections of complex numbers that they depict, it’ll be helpful to take a step backward, retreating from complex numbers to the safe familiar world of the integers, and see what ideals look like there.

ADDING MULTIPLES

When my daughter was five, she astounded me one day by announcing “I think there are two Tens. The first Ten is one, two, three, four, five, six, seven, eight, nine, ten; the other Ten is ten, twenty, thirty, and so on.” One Ten faced inward; the other faced outward. In her own way, she was following in Dedekind’s footsteps. Dedekind had, after all, been a pioneer of the theory of counting we nowadays attribute to Giuseppe Peano, which is all about the first kind of Ten; but linking the Ten concept with the set of multiples of ten nods at the way Kummer and Kronecker’s theory of divisors was supplanted by Dedekind’s theory of ideals.

An ideal in the ordinary integers is any collection of integers (let’s call the collection I) with the following two properties:

(1) Every sum of two elements of I is an element of I.

(2) Every multiple of an element of I is an element of I.

One example of such a set I is the set of multiples of 10:

{…, −30, −20, −10, 0, 10, 20, 30, …}

Add any two multiples of ten, and you get a multiple of ten. Multiply a multiple of ten by any integer you like (it doesn’t have to be a multiple of ten), and you get a multiple of ten.

Another ideal in the ring of integers is the set of multiples of fifteen:

{…, −45, −30, −15, 0, 15, 30, 45, …}

There’s nothing special about ten or fifteen here; for each integer m, the set of multiples of m satisfies conditions (1) and (2) and hence is an ideal. We call it the ideal generated by m.

We can multiply ideals in a straightforward way. For instance, to multiply the ideal generated by 10 by the ideal generated by 15, just multiply every multiple of 10 by every multiple of 15 in all (infinitely many) possible ways and gather those products into a new collection. You’ll recognize that what you get is none other than the ideal consisting of all multiples of 150. And this works for any two natural numbers a and b, not just 10 and 15: if you multiply the ideal generated by the number a by the ideal generated by the number b, you get the ideal generated by the number ab.

But if you try to add ideals in an analogous way, something strange happens. If you add every multiple of 10 to every multiple of 15 in all (infinitely many) possible ways and gather those sums into a new collection, you get the ideal consisting of all multiples of 5. What’s going on here is that 5 is the greatest common divisor of 10 and 15. More broadly, if you add the ideal generated by a to the ideal generated by b, you get the ideal generated by the greatest common divisor of a and b. So, for instance, returning to 4×9 = 6×6, one way we could “find” the primes 2 and 3 that are hiding under the surface of this equation is by adding the set of multiples of 4 to the set of multiples of 6 (obtaining the set of multiples of 2) and by adding the set of multiples of 6 to the set of multiples of 9 (obtaining the set of multiples of 3).
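Since an ideal is an infinite set, a computer can only look at a finite window onto it, but that’s enough to watch the product and the sum of two ideals take shape. In the minimal sketch below, the names multiples and combine, and the window sizes limit and search, are all illustrative assumptions; if search is chosen too small relative to limit, some members of the window will be missed.

```python
from operator import add, mul

def multiples(m, limit):
    """Multiples of m between -limit and limit: a finite window onto the ideal generated by m."""
    return {k for k in range(-limit, limit + 1) if k % m == 0}

def combine(a, b, op, limit, search=100):
    """Apply op to every pair (multiple of a, multiple of b) drawn from [-search, search],
    keeping only the results that land in [-limit, limit]."""
    results = set()
    for x in multiples(a, search):
        for y in multiples(b, search):
            value = op(x, y)
            if abs(value) <= limit:
                results.add(value)
    return results

print(sorted(combine(10, 15, mul, limit=600)))           # the multiples of 150 in the window
print(sorted(combine(10, 15, add, limit=30)))            # the multiples of 5 in the window
print(combine(4, 6, add, limit=20) == multiples(2, 20))  # 4 and 6 reveal the hidden 2
print(combine(6, 9, add, limit=20) == multiples(3, 20))  # 6 and 9 reveal the hidden 3
```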

This way of thinking about the “hidden” prime numbers 2 and 3 sheds new light on the “artificial” prime ideals 𝖆 and 𝖇. After all, one way we can describe 𝖆 is that it divides both 2 + sqrt(−5) and 3. Could it be the “greatest common divisor” of those two numbers? That is, if we add the set of multiples of 2 + sqrt(−5) to the set of multiples of 3, might we get the set of numbers that 𝖆 divides? We have to be careful about the meaning of “multiples” here. We’re no longer limiting ourselves to the ordinary integers, so numbers like 3 × (5 − 6 sqrt(−5)) need to be accepted as multiples of 3, and numbers like (2 + sqrt(−5)) × (5 − 6 sqrt(−5)) need to be accepted as multiples of 2 + sqrt(−5). But with this proviso in mind, if we add every multiple of 2 + sqrt(−5) to every multiple of 3, we get precisely the collection of complex numbers shown in that funky picture in the last section.
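The same finite-window experiment can be run in the ring of numbers a + b sqrt(−5), stored as pairs (a, b). The sketch below adds multiples of 2 + sqrt(−5) to multiples of 3, where the multipliers themselves range over a small box of ring elements, and compares the result with a membership rule that is not stated in the essay: as far as a short calculation shows, a + b sqrt(−5) lands in the set exactly when a + b is divisible by 3 (and in the mirror-image set exactly when a − b is). The function names and the box sizes are hypothetical choices.

```python
def mul(x, y):
    """Multiply a + b*sqrt(-5) by c + d*sqrt(-5), both given as integer pairs."""
    a, b = x
    c, d = y
    return (a * c - 5 * b * d, a * d + b * c)

def box(r):
    """All pairs (a, b) with |a| <= r and |b| <= r."""
    return [(a, b) for a in range(-r, r + 1) for b in range(-r, r + 1)]

def ideal_window(g1, g2, coeff_range=4, limit=4):
    """Sums alpha*g1 + beta*g2, with alpha and beta in a small box of ring elements,
    restricted to results whose coordinates lie in [-limit, limit]."""
    found = set()
    for alpha in box(coeff_range):
        p = mul(alpha, g1)
        for beta in box(coeff_range):
            q = mul(beta, g2)
            z = (p[0] + q[0], p[1] + q[1])
            if abs(z[0]) <= limit and abs(z[1]) <= limit:
                found.add(z)
    return found

# Conjectured membership rules, checked rather than assumed by the comparison below.
rule_a = {(a, b) for (a, b) in box(4) if (a + b) % 3 == 0}
rule_b = {(a, b) for (a, b) in box(4) if (a - b) % 3 == 0}

print(ideal_window((2, 1), (3, 0)) == rule_a)    # multiples of 2 + sqrt(-5) plus multiples of 3
print(ideal_window((2, -1), (3, 0)) == rule_b)   # the mirror image, using 2 - sqrt(-5)
```

Note that 1, stored as (1, 0), fails the rule, so the set is not the whole ring; something genuinely new is being "divided by".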

This set of complex numbers (call it I) is an ideal in Dedekind’s sense because it satisfies properties (1) and (2): every sum of two elements of I is an element of I, and every multiple of an element of I is an element of I. This ideal gives us everything we need to know about 𝖆. Just as Dedekind modeled irrational numbers as collections of rational numbers satisfying certain properties, he modeled Kummer’s ideal complex numbers as collections of complex numbers satisfying certain properties. Once Dedekind’s rewrite of Kummer’s ideas had taken root, it mostly displaced Kummer and Kronecker’s approach.

AFTERWARDS

We’ve seen that, for all the brilliance of Kummer’s pioneering work on “ideal complex numbers”, new numbers weren’t needed in order to rescue unique factorization in number rings; Dedekind showed that instead of introducing new numbers, one could work with suitable collections of old numbers. But it later emerged that in an even more concrete sense, the numbers Kummer wanted were already present in the complex numbers. More specifically, for every cyclotomic number ring R there’s a larger number ring R′ containing R with the property that every element of R has a unique factorization as a product of primes in R′.13 It’s possible to base the theory of divisors in R on the irreducible elements of R′, obviating the need for Dedekind’s approach to saving unique factorization at least as far as FLT is concerned, but it’s just as well that nobody took that approach, since Dedekind’s definition of ideals has proved to be such a pivotal concept in modern algebra.

Dedekind’s way of putting objects together into sets (not just in his theory of ideals but in his theory of cuts as well) points forward to developments in the second half of the nineteenth century. Dedekind’s approach via ideals afforded slick definitions and proofs, but it wasn’t so useful for computing concrete examples. (Consider: You can get very slick proofs in elementary number theory if you define the greatest common divisor of two positive integers a and b as the smallest positive integer that can be expressed in the form am+bn, where m and n range over all integers. That definition is very much in the esprit of Dedekind. But if you try to use this definition to compute the greatest common divisor of 4 and 6 in the most stupidly straightforward way, you’ll never finish, because there are infinitely many m and n to try; I mean, just because you’ve tried a bajillion pairs m, n and never expressed 1 as 4m + 6n, how do you know you won’t succeed on your very next try?) In contrast, Kronecker’s theory of divisors gave concrete answers to concrete problems, but wasn’t as easy to work with when one wanted to prove general theorems.
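The contrast between a slick definition and a usable procedure can be made concrete. The sketch below puts a bounded version of the "stupidly straightforward" search next to the extended form of Euclid’s algorithm, a standard procedure that the essay doesn’t mention and that isn’t being attributed to Kronecker here. The function names and the search bound are illustrative assumptions. The search can only report the best value found so far; Euclid’s algorithm stops on its own and hands back m and n.

```python
def smallest_combination(a, b, bound):
    """Smallest positive value of a*m + b*n with |m|, |n| <= bound.

    A finite search like this can never certify that no smaller value
    exists outside the box; it only reports the best value seen.
    """
    values = [a * m + b * n
              for m in range(-bound, bound + 1)
              for n in range(-bound, bound + 1)
              if a * m + b * n > 0]
    return min(values)

def extended_gcd(a, b):
    """Return (g, m, n) with g = gcd(a, b) and g == a*m + b*n."""
    old_r, r = a, b
    old_m, m = 1, 0
    old_n, n = 0, 1
    while r != 0:
        q = old_r // r
        # Each step keeps the invariants old_r == a*old_m + b*old_n and r == a*m + b*n.
        old_r, r = r, old_r - q * r
        old_m, m = m, old_m - q * m
        old_n, n = n, old_n - q * n
    return old_r, old_m, old_n

print(smallest_combination(4, 6, bound=5))   # best value found inside the box
g, m, n = extended_gcd(4, 6)
print(g, m, n, 4 * m + 6 * n)                # gcd, the coefficients, and a check
```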

Nowadays, most mathematicians see value in both slick formalisms and efficient algorithms, and reject the idea that one must choose one or the other. Nor is this syncretism an entirely new thing in mathematics. Dedekind and Kronecker, as far as I am aware, had great respect for one another. They had different modes of thinking, but each recognized what the other was doing as a valid style of mathematics.

As we’ll see a few months from now, there was another nineteenth century mathematician to whom Kronecker did not accord the same respect that he did to Dedekind. This mathematician really ran with Dedekind’s idea of forming sets of mathematical objects and treating those sets as new mathematical objects in their own right, and he ran so far with it that Kronecker thought he’d run beyond the bounds of legitimate mathematics entirely. Kronecker reviled this younger colleague as a “corrupter of youth” and sought to derail his career in every way that came within his power. But that’s another story for another essay.

I’ll conclude by saying why (I think) “ideal complex numbers” never became A Thing, or at least aren’t one today. To start with, Kummer’s ideas don’t have much to say about complex numbers in general; his work was situated squarely in algebraic number theory, and its applicability is limited to algebraic numbers, that is, numbers defined by algebraic equations. This leaves out “most” complex numbers, such as the transcendental numbers e and π. So at most Kummer devised a theory of ideal algebraic numbers.

But more importantly, Kummer never gave a way to add ideal complex numbers that would extend the way we add complex numbers or even ordinary integers. Multiply them, yes; add them, no. Dedekind’s rewrite didn’t change this situation, and with a little thought you can see why: the multiples of 2 are the same as the multiples of −2, so you should really think of this ideal as being generated by the “sign-ambiguous number” ±2. The same goes for the ideal generated by m; you can think of it as being associated with the sign-ambiguous number ±m. Now, you can multiply sign-ambiguous numbers in a sensible way: ±6 times ±4 equals ±24. But if you try to add sign-ambiguous numbers in a way that reflects how ordinary addition works, chaos ensues: +6 plus +4 is +10, but +6 plus −4 is +2. So should ±6 plus ±4 be equal to ±10, or ±2, or something else?
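The asymmetry between multiplying and adding sign-ambiguous numbers shows up immediately if each one is modeled as a two-element set. This is only a toy sketch: representing ±m as the Python set {m, −m} is an assumption made for illustration, as are the function names.

```python
from operator import add, mul

def sign_ambiguous(m):
    """Model the sign-ambiguous number ±m as the set {m, -m}."""
    return {m, -m}

def combine(xs, ys, op):
    """Apply op to every choice of sign for the two sign-ambiguous numbers."""
    return {op(x, y) for x in xs for y in ys}

product = combine(sign_ambiguous(6), sign_ambiguous(4), mul)
total = combine(sign_ambiguous(6), sign_ambiguous(4), add)

print(sorted(product))                 # the two values of ±24
print(product == sign_ambiguous(24))   # multiplication gives one sign-ambiguous number
print(sorted(total))                   # addition gives both ±10 and ±2 at once
```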

When we move from the ordinary complex numbers to Kummer’s “idealized complex numbers”, we lose the ability to perform integer addition. That debility stretches our leniency about the meaning of “number” to the breaking point. It violates the Principle of Permanence that had governed earlier extensions of the word “number”.

Dedekind’s ideals aren’t numbers. As we’ll see, they’re something better.

This essay is a draft of chapter 12 of a book I’m writing, tentatively called “What Can Numbers Be?: The Further, Stranger Adventures of Plus and Times”. If you think this sounds cool and want to help me make the book better, check out http://jamespropp.org/readers.pdf. And as always, feel free to submit comments on this essay at the Mathematical Enchantments WordPress site!

ENDNOTES

#1. In Fermat’s original version of the assertion a, b, and c were rational numbers, not integers, but it’s not hard to show that if either version is true then the other holds as well. One direction is easy: every integer solution is a rational solution. The other direction is almost as easy: every rational solution becomes an integer solution if you multiply a, b, and c by their lowest common denominator.

#2. For instance, suppose you knew that a^3 + b^3 = c^3 had no solutions with a, b, c all nonzero integers. Then there couldn’t be a nonzero integer solution to a^6 + b^6 = c^6 either, because if a, b, c is a counterexample to FLT with n = 6 then a^2, b^2, c^2 is a counterexample to FLT with n = 3. This reasoning applies not just to 3 and 6 but to any odd prime p and any multiple of that prime. That takes care of FLT for all exponents n except powers of 2 (the only numbers without an odd prime factor). But once we know that FLT is true for n = 4 as well, the same style of argument takes care of all multiples of 4, which includes all the powers of 2 beyond 4, and the proof is done – if only we can prove FLT for all odd primes.

#3. Euler gave a second proof of the case n = 3 and the second proof was valid, so Euler is rightfully given credit for the result; see the MacTutor article on Fermat’s Last Theorem listed in the References.

#4. One way to think about the factorization is to divide both sides by b^n and to write a/b as x; then the equation becomes x^n + 1 = (x + ζ_1) (x + ζ_2) ··· (x + ζ_n). Both sides are polynomials of degree n with leading coefficient 1; to show that equality holds, it’s enough to show that they have the same complex roots. The left hand side equals 0 whenever x is an nth root of −1; the right hand side equals 0 whenever −x is an nth root of 1 (for instance, if x + ζ_1 = 0, then −x = ζ_1 and (−x)^n = (ζ_1)^n = 1). But these two conditions on x are equivalent since we’ve assumed that n is odd: (−x)^n = ((−1)(x))^n = (−1)^n (x)^n = −x^n.

#5. The name arises because if you plot the complex nth roots of 1 in the plane they lie on the circle of radius 1 around the origin and they cut that circle into n equal pieces. (“Cyclotomic” means “circle-cutting”.)

#6. Smart alecks will object that primes can’t be written as products of primes. Modern mathematicians, who like Kummer esteem the Fundamental Theorem of Arithmetic so highly that they’ll redefine words and concepts in order to save it, have countered that objection by broadening the notion of what “product” means, so that a single prime is a product with just one factor, namely that prime itself. The smart alecks may then contend that surely the number 1 can’t be written as a product of primes. Mathematicians have an answer for that too; they’ve broadened the notion of “product” even further, so that the number 1 is a product with no factors at all! Mathematicians didn’t adopt this convention about the “empty product” merely to make the Fundamental Theorem of Arithmetic easier to state without fussy caveats and exemptions; it makes lots of other theorems simpler too. But it takes some getting used to, and if you find it disconcerting, you’re in good company; even set theory pioneer Richard Dedekind, whom we’ll meet later in this essay, felt uncomfortable with the empty set. See the Kanamori article listed in the References.

#7. To see what can go wrong if you have too many numbers in a number ring, suppose for instance we’re in a number ring that contains the square root of 5, the 4th root of 5, the 8th root of 5, and so on. In this number ring 5 factors as the square root of 5 times the square root of 5, but each of the factors in turn can be broken down as the fourth root of 5 times the fourth root of 5, and each of those factors can be broken down into even smaller factors, and so on, ad infinitum; this unending cascade undermines the whole idea of breaking things down into indivisible “prime” building blocks.

#8. I’ve chosen this example for simplicity, but at a cost: this number ring isn’t one of the cyclotomic number rings Lamé used in his attempted proof of FLT. But Richard Dedekind, whose entrance into today’s story is approaching, liked to trot out this example when he was explaining how the Fundamental Theorem of Arithmetic can fail in number rings, so it’s good enough for me.

#9. They were born just two months apart, which makes me wonder: was the name Gustav especially popular that winter? By the way, Dirichlet’s proper surname is Lejeune Dirichlet, but most people nowadays drop the “Lejeune”.

#10. Note that Kummer did not call them “illegitimate numbers”, “spurious numbers”, “bogus numbers”, or anything of a similarly disparaging nature; he seems to have learned the lesson of past centuries, in which mathematicians adopted deprecatory terminology that didn’t age well and that their successors got stuck with.

#11. I’m distorting the math here a bit. Kummer wouldn’t have said 2 + sqrt(−5) was equal to 𝖆²; rather, he would have said it was twice-divisible by 𝖆. But in the interest of intelligibility I’ll persist in this distortion.

#12. If you replace x, y, and z by c, b, and a respectively and rearrange terms, you can turn this into the version of the Fermat equation I gave earlier.

#13. You might hope that all the numbers in R′ factor uniquely as products of primes of R′, but no such luck; the numbers in R′ only factor uniquely if you use primes that belong to an even bigger number ring R′′, and so on, ad infinitum.

REFERENCES

Jeremy Avigad (translator), Dedekind’s 1871 version of the theory of ideals. See especially the translator’s introduction.

Keith Conrad, Fermat’s Last Theorem for Regular Primes.

Harold M. Edwards, The Background of Kummer’s Proof of Fermat’s Last Theorem For Regular Primes. Archive for History of Exact Sciences, Vol. 14, No. 3 (5.XII.1975), pp. 219-236.

Jeremy Gray, Mathematicians as Philosophers of Mathematics: Part I. For the Learning of Mathematics, vol. 18, no. 3 (1998). See pages 21–24.

Akihiro Kanamori, The Empty Set, the Singleton, and the Ordered Pair. The Bulletin of Symbolic Logic, Vol. 9, No. 3 (Sep., 2003), pp. 273-298.

Ernst Kummer, Extrait d’une lettre de M. Kummer à M. Liouville. Journal de mathématiques pures et appliquées, 1re série, tome 12 (1847), p. 136.

MacTutor, Fermat’s Last Theorem.
