# What Lovelace Did: From Bombelli to Bernoulli to Babbage

I want to tell you about difference tables for polynomials, not only because they’re fun but also because they’ll give us a chance to see how polynomials played a role in the dawn of the computer age through the work of computer pioneers Charles Babbage and Ada Lovelace.

But first, where did polynomials come from?

THE ART OF THE THING

“Thing” is a marvelously flexible word, as are similar words like “res” and “cosa” that other languages have used to signify unspecified objects. Often the word denotes a group of people who have come together for some purpose: think of the Roman Republic (the “public thing”) or the Cosa Nostra (“Our Thing”). Curiously, the English word “thing” itself seems to have traveled in the opposite direction, starting out as meaning an assembly of people and ending up as meaning, well, any-thing. Math has made its own uses of nonmathematical words for indefinite objects: in Indian and Arabic algebra, the quantity being sought was often called “the thing”. It was natural for European algebraists to borrow this usage, and indeed Renaissance algebra was sometimes referred to as “the art of the thing”. (See Endnote #1.)

Of course polynomials predate Europe; mathematicians around the world used polynomials two thousand years ago, back in the days when math was made of words and algebra was rhetorical (see Endnote #2). But when you think of polynomials you probably think of the way they’re written in modern high schools, which means you’re thinking of the modern, symbolic approach to polynomials pioneered by Rafael Bombelli. I wrote about Bombelli in my essay on complex numbers. His 1572 book L’Algebra popularized a version of exponential notation similar to the one we use today and he gave clear rules for how to perform arithmetic operations on polynomials. For instance, here’s an approximate translation of Bombelli’s description of how we can multiply simple polynomials like 3x4 and 5x6 (where he refers to powers as “dignities” and exponents as “abbreviatures”):

When one has to multiply dignities one adds the numbers of the abbreviatures written above, and from those will be formed an abbreviature of dignities, and the numbers that stand below that dignity are simply multiplied.

That is, to multiply 3x4 by 5x6 we multiply the “numbers below” (the 3 and the 5) to get 15, and we add the “abbreviatures” (the 4 and the 6) to get 10, obtaining 15x10 as the answer.

Polynomial notation gave algebra a new economy of expression, but more important was the new viewpoint that polynomials brought in, relating solving equations to factoring polynomials. Consider the equation x2 + 2 = 3x; it’s not hard to check that x = 1 and x = 2 are solutions, but how can we be sure that there aren’t more? On the other hand, consider the equation (x−1)(x−2) = 0. It’s algebraically equivalent to x2 + 2 = 3x (to see why, expand the left hand side of (x−1)(x−2) = 0 and do some rearranging), but it’s more eloquent in explaining to us why there are no solutions we’ve overlooked. For, if x is equal to neither 1 nor 2, then x−1 can’t be 0 (because x isn’t 1), and x−2 can’t be 0 (because x isn’t 2), so neither x−1 nor x−2 can equal 0; but then (x−1)(x−2), being the product of two nonzero numbers, can’t be 0. In modern algebra, the fact that the product of two nonzero numbers cannot be zero gets promoted from boring truism to organizing principle. In particular, it implies that a polynomial equation of degree n has at most n solutions.

The word “polynomial” goes back to the writings of René Descartes, who cobbled the word together using the Greek prefix “poly” meaning “many” and the Latin root “nomen” meaning “name”. Here “poly” refers to the fact that a polynomial can be a sum of many terms that feature x raised to various powers. When there’s just one term, as in the polynomial x4, we refer to the polynomial as a monomial (even though “mononomial” would be more apt). When there are two terms, as in the polynomial x4 + x3, we refer to the polynomial as a binomial. (See Endnote #3.)

MAKING A DIFFERENCE

Polynomials have many important applications in the sciences, but the following party trick is not one of them: Have a friend pick four counting numbers a, b, c, and n but not reveal them to you; instead, they are to reveal to you the values of the polynomial ax2 + bx + c evaluated at x = n, x = n+1, and x = n+2. You amaze them, at least in theory, by telling them, after a quick bit of pencil-and-paper calculation, the value of the polynomial ax2 + bx + c evaluated at x = n+3, n+4, and n+5, the next three numbers in turn — even though you don’t know what a, b, c, or n are!

For instance, say they picked a, b, c, and n to all equal 1 (but you don’t know that), so they reveal to you the values of the polynomial x2 + x + 1 at x = 1, 2, and 3; that is, they reveal to you the numbers 12 + 1 + 1 = 3, 22 +2+1 = 7, and 32 +3+1 = 13. You would create a table like this:

The first row of this table consists of the three numbers your friend revealed (call them r1r2, and r3), with spaces reserved for the three numbers you’re going to eventually announce back. The second row gives the differences r2 − r1 and r3 − r2, while the third row gives the difference-of-differences (r3 − r2) − (r2 − r1). (Check: 7 − 3 = 4, 13 − 7 = 6, and 6 − 4 = 2.) We say that the second row is the difference sequence of the first row (and likewise the third row is the difference sequence of the second row).

Now extend that bottom row by repeating its sole entry three more times:

Next, fill in the second row of the table in such a way that the third row is the difference-sequence of the second row:

(Check: 8−6 = 2, 10−8 = 2, and 12−10 = 2. Of course I didn’t just guess the numbers 8, 10, and 12 at random and happen to be lucky; I calculated them successively as 6 + 2 = 8, 8 + 2 = 10, and 10 + 2 = 12.)

Finally, fill in the first row of the table in such a way that the second row is the difference-sequence of the first row:

(13+8 = 21, 21+10 = 31, and 31+12 = 43.) You announce the numbers 21, 31, and 43, and your friend checks that these are indeed the values of 42 +4+1, 52 +5+1, and 62 +6+1.

And now, if you really want to be impressive, you can race your friend to compute the next five terms. You’ll probably win by just extending your table out five more places, because your arithmetic will be simpler than theirs! It’s possible that you and your friend will get different answers for one of the numbers. Then you can pull out your phone, open the calculator app, and see who’s right and who’s wrong. My guess is that your friend is wrong, because the procedure they’re following involves more steps.

Why does the trick work? Look back at the table. Notice that the second row is an arithmetic progression, with each term increasing by the same amount (namely 2) as one progresses from left to right. I show in Endnote #4 that if you write the successive values taken on by a quadratic polynomial and compute the successive differences, you always get an arithmetic progression. That is, if you list successive values taken on by a polynomial function of degree 2, and you take the differences, those differences will be the successive values taken on by a polynomial function of degree 1. And this trick isn’t limited to polynomials of degree 2; see Endnote #5.

We call the tables we generate in this way difference tables. This game of operating on polynomials to get new polynomials of lower degree (deriving a row of the table from the row above) and reversing the process (deriving a row of the table from the row below) is called the calculus of finite differences, not to be confused with the infinitesimal calculus of Isaac Newton and Gottfried Leibniz. (Incidentally, even before Leibniz and Newton were born, Indian mathematicians in Kerala applied the calculus of finite differences to the sine and cosine functions and obtained the power series expansions of these functions! I’ll tell you this story soon.)

I propose a puzzle you might wish to think about using the ideas of this section: Show that if p(·) is a polynomial of degree d and if p(1), p(2), . . . , and p(d+1) are all integers, then p(n) is an integer for all integers n. (An example of such a polynomial is my “Personal Polynomial” (5/2)x2 − (17/2)x + 16 which I wrote about last month.) For a solution, see Endnote #6.

PATTERNS IN THE POWERS

I have no talent for grave-robbing, nor am I consumed by curiosity about the respective roles played by heredity and environment in determining a person’s mathematical ability; but were I so talented, and so consumed, I’d consider digging up the bones of the Bernoullis to scrape together some of their DNA. These eight mathematicians constituted a kind of European mathematical nobility for nearly a century. If there’s anything like a math gene, one might suppose that the Bernoullis had it. (Do historians of science know anything about the women of that family? Given that they shared the heredity and environment of the men, one would expect that they too would have displayed mathematical talent even if they were never given a chance to develop it.)

The Bernoulli numbers — the sequence of numbers 1, 1/2, 1/6, 0, −1/30, 0, 1/42, 0, −1/30, 0, 5/66, … — were named after a member of the Bernoulli family, but he didn’t discover them. Priority belongs to the German weaver-surveyor-mathematician Johann Faulhaber (1580-1635) who had investigated them a hundred years earlier.

The roots of Faulhaber’s work were ancient. It had been known for millennia that 1 + 2 + 3 + … + n is equal to (1/2)n2 + (1/2)n for all n, and for nearly as long that 12 + 22 + 32 + … + n2 is equal to (1/3)n3 + (1/2)n2 + (1/6)n for all n and that 13 + 23 + 33 + … + n3 is equal to (1/4)n4 + (1/2)n3 + (1/4)n2 for all n. Notice the degrees of those polynomials: 2, 3, and 4. It’s natural to guess that the pattern continues, and that for all positive integers k, the sum 1k + 2k + 3k + … + nk is given by some k+1st-degree polynomial function of n. (See Endnote #7.) The question is, which k+1st-degree polynomial function of n does the job? There are infinitely many to choose from.

Faulhaber found a formula in which certain mysterious multipliers played a role. These multipliers are what we now call the Bernoulli numbers. Faulhaber computed the first dozen or so of them and then stopped; he had demonstrated a way to find them, albeit an arduous one, and that was progress enough for him.

Faulhaber’s work was forgotten for a century. Then two mathematicians working completely independently rediscovered what Faulhaber had known: the Japanese mathematician Seki Takakazu (1642-1708) and the Swiss mathematician Jacob Bernoulli (1654-1705). Neither mathematician published his results while he was still alive; Takakazu’s result was published in 1712, while Bernoulli’s was published in 1713 as Summae Potestatum.

Bernoulli’s elation at his discovery is evident from his words, and his joy is laced with a bit of schadenfreude:

I have found in less than a quarter of an hour that the tenth powers of the first thousand numbers beginning from 1 added together equal 91,409,924,241,424,243,424,241,924,242,500, from which it is apparent how useless should be judged the works of Ismael Bullialdus, recorded in the thick volume of his Arithmeticae Infinitorum, where all he accomplishes is to show that with immense labor he can sum the first six powers — part of what we have done in a single page.

Bernoulli’s analysis laid more emphasis on the numbers he called A, B, C, D, etc. than Takakazu’s did, and Bernoulli was part of the European mainstream, so it’s natural that Leonhard Euler named these numbers Bernoulli numbers and not Takakazu numbers.

STEAM DREAMS

The English mathematician Charles Babbage had a dream.

It started in 1821, when young Charles and his friend John Herschel, charter members of the newly founded London Astronomical Society, were bemoaning the unreliability of published astronomical tables. Existing tables often had errors, whether caused by the people who computed the numbers (called “computers” in those days) or by the typesetters who recorded the numbers in print. Meanwhile, elsewhere in England, steam was doing amazing things to amplify the powers of the human body. As Babbage would later write:

Mr. Herschel . . . brought with him the calculations of the computers, and we commenced the tedious process of verification. After a time many discrepancies occurred, and at one point these discordances were so numerous that I exclaimed, “I wish to God these calculations had been executed by steam,” to which Herschel replied, “It is quite possible.”

Mechanical adding machines already existed; the philosopher-mathematician Blaise Pascal had himself invented one in 1642. Babbage realized by taking the components of such machines and hooking them together in new ways, he could create machines that could automatically tabulate the values of polynomial functions.

Let’s see how a small Difference Engine (for that is what Babbage called his invention) could have tabulated the values of the polynomial n2 + n + 1 that we met at a party earlier in this essay. Picture a machine with three numerical registers that evolve over time, with each number represented by the states of various gears as in a Pascal adding machine. At the beginning of the machine’s performance of its task the registers show the three numbers

(never mind why those specific numbers, at least for now). Then the device adds the 4 in the second register to the 3 in the first register and adds the 2 in the third register to the 4 in the second register, leaving the 2 in the third register alone:

(Here the straight and slanted vertical lines indicate how the new values of the registers are sums of the old values: 7 is 3+4, 6 is 4+2, and 2 is just 2.) Then the device adds the 6 in the second register to the 7 in the first register and adds the 2 in the third register to the 6 in the second register, once again leaving the 2 in the third register alone:

Then it executes the same procedure again:

And so on.

Do these numbers look familiar? They should! Babbage’s machine is constructing the difference table

one diagonal at a time. If we wanted a table of values of the polynomial n2 + n + 1, all we’d have to do is connect the Difference Engine to a suitable printer and have it print out the numbers that successively occur in the first register.

Earlier I wrote: “Polynomials have many important applications in the sciences, but the following party trick is not one of them.” That’s true enough. But we also saw (in the final part of the trick) that the paper-and-pencil technology of difference tables gave you a better way to compute values of your friend’s polynomial than your friend’s straightforward way. Also, we just saw how Babbage realized that you and your hand could be replaced by infallible (or at least less-fallible) gears, and that the more-accurate tables produced by such mechanical processes would have scientific (and industrial and military) uses. So yes, it’s a party trick. But it’s a party trick with a direct thematic link to Babbage’s Difference Engine. And as we’ll soon see, Babbage didn’t stop there.

Annabella Milbanke was not exactly George Gordon’s “type”, and he was certainly not hers. The latter circumstance was part of her appeal; her initial indifference intrigued and attracted him. (Or so at least I surmise; I wasn’t there.) But let’s give Annabella and George their titles: Annabella Milbanke Baroness Wentworth and George Gordon Lord Byron. Yes, that Byron. The memorable description of the iconic poet and scoundrel as “mad, bad, and dangerous to know” was coined not by his enemies but by his more-than-friend Lady Caroline Lamb. Milbanke would be known in her later years as a champion of progressive causes, such as vaccination. Byron admired the intelligent young Annabella and in a letter to her aunt dubbed the bookish heiress a “Princess of Parallelograms”. Annabella rejected his first proposal of marriage but unwisely accepted the second. When a libertine and a moralist fall in love in a romantic comedy, each has a moderating influence on the other, but in real life, clashing natural proclivities can lead to ever-greater polarization, and such I think was the case with Annabella and George. After only a few years their marriage ended (“I heard she moved out! She gave up on the marriage!” “Well, I heard she only moved out after he brought one of his lovers in!”), but before that they had a baby together, a child who was mere months old when her parents separated.

Young Ada showed an even greater aptitude for mathematics than her mother had and Annabella encouraged the girl, hoping that the pursuit of mathematics, and more broadly the cultivation of her faculties of reason, would protect her from the genetic taint of her father’s unbalanced temperament (or, some might say, insanity). The end result of Annabella’s efforts was a daughter who pursued what Ada called “poetical science”, eventually earning her (from an admiring Babbage) a nickname that surpassed the one Byron had given her mother: “Enchantress of Number”.

Ada met Babbage in 1833 at a party at Babbage’s house. The teenager must have made a good impression on the quadragenarian, because he invited her and her mother to attend a demonstration of his newly constructed Difference Engine (a prototype of a larger version he hoped to build): a two-foot-tall machine with 2000 brass parts that was powered by a hand crank and could have printed out mathematical tables if only Babbage had completed the envisioned printer that went with it.

Babbage’s plans for his Difference Engine ran aground on the shoals of several problems. One was that machining parts to the required precision was a more arduous and expensive proposition than he had realized when he first proposed his scheme to the British government as a way to automate the production of mathematical tables. Someone with more business sense or political savvy might have been able to handle these delays and overruns, but the unworldly Babbage wasn’t up to the task. Besides, he was distracted by a vision of an even greater machine that could solve more interesting problems than computing values of polynomials. He called the envisioned device his Analytical Engine.

In 1840, Babbage, having failed to interest the British government in supporting his work, visited Italy to give a talk about the Analytical Engine and thereby drum up enthusiasm and funding overseas. His talk inspired the Italian mathematician Luigi Menabrea to publish a description of the engine in French. Ada (now married to William King-Noel, First Earl of Lovelace) took on the project of translating Menabrea’s article into English, and decided to enhance the article with Notes of her own (notes so comprehensive that they eventually dwarfed what Menabrea had written). In her notes, Lovelace limns a future in which mechanical devices will be able to do such things as compose music, if only we tell them quite precisely the mathematical rules governing musical composition. But “Sketch of the Analytical Engine Invented by Charles Babbage, with Notes by the Translator” isn’t just a prophecy of a coming age of machine intelligence; it also contains what some have called the first true computer program. And that program computes Bernoulli numbers.

Computing Bernoulli numbers is a good deal more complicated than computing the values of polynomials; see Target’s article if you want to know more. One reason Lovelace chose this computing challenge was that it showcased a feature of the Analytical Engine that the Difference Engine had lacked: the ability of a computation to enter a loop, or as Babbage put it, to “eat its own tail”. Nowadays we recognize the momentous significance of the difference between the two designs: in principle, a machine like the Analytical Engine had crossed over into the domain of universal computation.

Whether or not one chooses to regard Lovelace’s program as the first computer program ever written, it was certainly the most complicated set of instructions for a mechanical computation that had ever been described up till then. Computer science bloggers Jim Randall and Sinclair Target and Stephen Wolfram noticed that at one point in her program there’s a mistake: the numerator and denominator of a fraction have been swapped. The first programmer was also the creator of the first programming bug! But I don’t think the bug does her any discredit; it points to the magnitude of her ambition and the complexity of the task she had chosen to undertake. As Target asks, if you aren’t writing bugs, are you writing real programs?

Alas, Babbage’s dreams were bigger than his budget and his managerial capabilities. The Engine was never built. Ada died of cancer in 1852 at the age of 37, and Charles died in 1871, a bitter and disappointed old man. The Victorian Computer Age never dawned (though authors of steampunk fiction keep wondering “What if …?”).

If the Analytical Engine had been built in her lifetime, I have no doubt that Lovelace would have found the bug in her program. And while we’re talking about mistakes, did any of you notice that the page from Bernoulli’s “Summae Potestatum” that I showed you earlier contains an error? That last term in the second-to-last polynomial in his table should be −3/20 nn, not −1/12 nn. Bernoulli was a creative human being, not a mathematical engine. (Though I have no doubt that if you’d told Bernoulli he’d made a mistake in his table, he would’ve found it in a lot less than fifteen minutes.)

The history of math shows time and time again that the giants of mathematics aren’t flawless paragons of reason who never err; they’re humans who discover new vistas, explore them, have creative ideas (some of which work), inevitably stumble as they traverse landscapes never seen before, recover from their stumbles, and move on. Indeed, the creative faculty of human beings — not to be found in the Analytical Engine, which Lovelace famously wrote can only do “whatever we know how to order it to perform” — may share roots with the human propensity for error. The Latin root for “error” is the word for wandering. If we don’t wander off beaten paths, how will we know what vistas we’re missing?

Thanks to Sandi Gubin, Eliana Propp-Gubin, and Stephen Wolfram.

REFERENCES

Sarah Baldwin, Ada Lovelace and the Analytical Engine.

Peter Cameron, Polynomials taking integer values.

A. W. F. Edwards, “Sums of powers of integers: a little of the history”, The Mathematical Gazette, Vol. 66, No. 435 (Mar., 1982), pp. 22-28.

Martin Gardner, “The Calculus of Finite Differences”, chapter 20 in “New Mathematical Diversions”.

Silvio Maracchia, The importance of symbolism in the development of algebra, Lettera Matematica volume 1, pages 137-144 (2013).

Burkard Polster, “Power sum MASTER CLASS: How to sum quadrillions of powers … by hand! (Euler-Maclaurin formula)”: Mathologer channel, https://youtu.be/fw1kRz83Fj0 .

Duana Saskia, Discovering Ada’s Bernoulli Numbers, Part 1. (Alas, there seems to be no Part 2!)

Sinclair Target, What Did Ada Lovelace’s Program Actually Do?

Stephen Wolfram, Untangling the Tale of Ada Lovelace.

ENDNOTES

#1. Algebraists were sometimes called “cossists”, which I suppose could be translated as “thingologists”.

#2. The wordy kind of number-talk that people used in the old days really is called “rhetorical algebra”. I’m not kidding. Maybe high school algebra teachers today should spice things up in the classroom by using a broader range of classic rhetorical devices: accismus (“Whatever you do, don’t subtract x from both sides!”), adynata (“You’ll find a solution to x = x+1 when cows do calculus”), antimeria (“Let’s see if our old friend Mister Quadratic Formula can help us out here”), etc.

#3. The word “binomial” occurs in a famed assemblage of late-19th-century English cultural trivia called “The Major General’s Song”, from Gilbert and Sullivan’s operetta The Pirates of Penzance. In the song, career soldier Major General Stanley, while conceding his complete lack of military knowledge, boasts:

I’m very well acquainted, too, with matters mathematical;/ I understand equations, both the simple and quadratical./ About binomial theorem I’m teeming with a lot o’ news / With many cheerful facts about the square of the hypotenuse.

Likewise, in Conan Doyle’s story “The Final Problem”, we are meant to infer that Sherlock Holmes’ nemesis Moriarty is Holmes’ intellectual equal when Holmes tells Watson:

“He [Moriarty] is a man of good birth and excellent education, endowed by nature with a phenomenal mathematical faculty. At the age of twenty-one he wrote a treatise upon the Binomial Theorem, which has had a European vogue. On the strength of it he won the Mathematical Chair at one of our smaller universities, and had, to all appearances, a most brilliant career before him.”

Of course the binomial theorem of Newton was old news by the time Holmes came on the scene. But I like to imagine that Holmes was thinking of the q-binomial theorem, which (as part of the broader subject of q-series) was a hot topic on the Continent in the 19th century.

#4. Write the polynomial a x2 + b x + c as p(x) for short, and define q(x) as p(x+1) − p(x), so that the numbers in the second row of the table are q(n), q(n+1), . . . Expanding the expression p(x+1) − p(x) and regrouping, we get

q(x) = (a (x+1)2 + b (x+1) + c) − (a (x)2 + b (x) + c) = a ((x+1)2x2) + b ((x+1)−x) + (cc) = a·(2x+1) + b·(1) = (2a)x + (a+b) ,

a linear function of x. So the numbers in the second row form an arithmetic progression.

#5. If your friend gives you d+1 successive values of some polynomial of degree d, you can find successive terms using a trapezoidal array with d+1 rows instead of just three. That’s because if p(x) is a polynomial of degree d, the polynomial p(x+1) − p(x) simplifies to give a polynomial of degree d−1. So if the top row of your table gives d+1 successive values of a polynomial of degree d, the second row will give d successive values of a polynomial of degree d−1; likewise the next row will give d−1 successive values of a polynomial of degree d−2; and so on, with the dth row giving 2 successive values of a polynomial of degree 1 and the final (d+1st) row giving the value of a polynomial of degree 0. And a polynomial of degree 0 is just a constant function of x; once you know its value at x = n, you know its value at x = n+1, n+2, etc.! So you can fill in the last row, which lets you fill in the second-to-last row, and so on, all the way back up to the top. So you can fill in that top row and announce those numbers while your friend is still squaring and swearing.

#6. Since the top row consists of integers, the whole triangle beneath it consists entirely of integers as well (since the difference of two integers is always an integer). But the bottom row represents a constant polynomial, so the bottom row extends to give infinitely many repetitions of that integer. Now fill in the difference table going upward; the sum of two integers is always an integer, so you’ll never see any non-integer values anywhere in the extended table, including in its top row. So this tells us that p(d+2), p(d+3), etc. are all integers as well. Technically this argument only handles p(n) as n goes to positive infinity, not as n goes to negative infinity, but by extending the difference table to the left as well as the right we can take care of this case too. I learned of this pretty application of difference tables from Peter Cameron’s blog (see the References).

#7. If you take the sequence whose nth term is 1k + 2k + 3k + … + nk and form its difference sequence, you’ll just get the sequence of kth powers whose terms are of course given by a polynomial of degree k. You might say that the sequence 1k, 1k+2k, 1k+2k+3k, … is the “anti-difference” of the sequence 1k, 2k, 3k, … Since taking the difference sequence of a polynomial decreases its degree by 1, it makes sense that taking the anti-difference increases the degree by 1. And this fact from the calculus of finite differences foreshadows something similar that happens in the infinitesimal calculus, where differentiating a polynomial reduces its degree by 1 and anti-differentiating it increases its degree by 1. This is just one of many profound similarities between the calculus of finite differences and the differential calculus.