Saturday, March 2nd, 2019 · 15 min read

Surprisingly, one of the most interesting proofs I did in my first-year real analysis course.

Every time we create something new, we go from 0 to 1. The act of creation is singular, as is the moment of creation, and the result is something fresh and strange.

– Peter Thiel, *Zero to One*

A 1992 study published in *Nature* worked with five-month-old infants to determine their capacity for understanding addition and subtraction. Experimenters showed babies an object, hid it behind a screen, and then had the babies watch as they added an extra object behind the screen. During some trials, the experimenters would surreptitiously remove the extra object. Even at that age, the babies knew that something was wrong when they saw “zero more” objects added to the group instead of “one more” object.

For the most part, this is the innate intuition that carried us through our early math classes. If we were lucky (or unlucky, depending on who you ask), we got our first taste of formalizing this intuition in middle- or high-school geometry. Starting with propositions called “axioms” — things we took for granted as true — we were forced to consider how our intuition stemmed from these axioms, and constructed formal, albeit basic, mathematical “proofs” for results like the Law of Cosines or the congruence of two triangles.

If you forgot it, the Law of Cosines says that $c^2 = a^2 + b^2 - 2ab\cos(C)$, where $a$, $b$, and $c$ are side lengths of a triangle and $C$ is the angle opposite of side $c$. If you plug in 90 degrees for $C$, you get the Pythagorean theorem.
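To see that reduction concretely, here is a short worked version. The 3-4-5 triangle is just an illustrative choice of side lengths, not something from the course.

```latex
\begin{align*}
c^2 &= a^2 + b^2 - 2ab\cos(90^\circ) \\
    &= a^2 + b^2 - 2ab \times 0      && \text{since } \cos(90^\circ) = 0 \\
    &= a^2 + b^2 \\[6pt]
\text{With } a = 3,\ b = 4: \quad c^2 &= 9 + 16 = 25, \quad \text{so } c = 5.
\end{align*}
```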

In that first geometry class, we were told what we could assume to be true — but did we ever stop to ask why?

Who decided what exactly we could take for granted? Why these specific axioms? Why couldn’t we assume the Law of Cosines was true, and why did we have to prove it?

Mathematicians have thought long and hard about these questions, and the community consensus is not necessarily on specific axioms that we take for granted as true, but on a principle: keep the number of assumptions to a minimum. This is similar to a famous problem-solving technique known as Occam’s razor: “When presented with competing hypotheses to solve a problem, one should select the solution with the fewest assumptions.”

The problem of coming up with a minimal set of axioms from which all mathematics follows is harder than it looks. Mathematicians have labored for years to do so, and the most famous attempt was the *Principia Mathematica*, published in three volumes between 1910 and 1913 by mathematicians Alfred North Whitehead and Bertrand Russell. In 1931, however, logician Kurt Gödel proved that no such system is possible: in short, any choice of axioms rich enough to describe arithmetic would either be incomplete, and unable to prove all of mathematics, or inconsistent, and usable to prove contradictions.

Nonetheless, mathematics has to start from somewhere, and so mathematicians have defined specific axioms for the specializations they work in, like geometry (think Euclid’s axioms). These specialized axioms are what geometers, algebraists, and so forth have decided are the minimal set of assumptions they need to do productive work and draw valid conclusions.

It is through these axioms that we can rigorously show that 1 is in fact greater than 0, relying not on nebulous notions like “intuition,” but on solid mathematical footing built on the axiomatic consensus of the mathematical community.

Indeed, perhaps this is what differentiates our mental capacity from that of five-month-olds.

As a sidenote, bucking convention and exploring the consequences of alternative axioms have led to the creation of whole new branches of mathematics. One example is spherical geometry, which throws traditional Euclidean foundations out the window. On a sphere, for example, the angles of a triangle can add up to more than 180 degrees.

“God made the natural numbers; all else is the work of man.”

– Leopold Kronecker, German mathematician

When I say “minimal set of assumptions,” there are many different levels of “minimal” we can start at. Our foundational level of abstraction could potentially be that all we have to work with are the natural numbers — $1, 2, 3, ...$ — as Kronecker seems to be advocating for. Alternatively, we can simply take $1 > 0$ to be an axiom.

We could go in a few directions with the first approach. There are the Peano axioms, which are a set of axioms on the natural numbers that aim to fully describe their behavior. These axioms are almost like Newton’s Laws — not constructed, but rather, a *description* of the “natural” properties of the natural numbers. In this approach, we simply *define* the order of the natural numbers, so we conclude $1 > 0$ by construction.

We define the ordering of the natural numbers as: for natural numbers $a$ and $b$, $a \leq b$ if and only if $a + c = b$ for some natural number $c$.
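As a rough sketch of how this definition yields the result, assume (as the Peano axioms are often stated) that $0$ counts as a natural number and that $1$ is its successor, and read $a < b$ as "$a \leq b$ and $a \neq b$". Both of these are assumptions made for illustration.

```latex
\begin{align*}
0 + 1 &= 1      && \text{so } 0 \leq 1 \text{, taking } c = 1 \text{ in the definition} \\
1 &\neq 0       && \text{since } 0 \text{ is not the successor of any natural number} \\
\text{hence } 0 &< 1.
\end{align*}
```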

It’s valid, but to a certain extent it seems like a bit of a cheap shot — we’re essentially defining our result into existence.

On the other hand, we could try to prove $1 > 0$ in the real numbers. However, starting from the fundamentals in this direction is almost “too close to the hardware,” and to go from the naturals ($1, 2, 3$, etc.) to the reals (e.g. $\sqrt{2}, \pi, 3$) necessitates the use of such concepts as Cauchy sequences, equivalence classes, and more — tools that require a thorough background in modern algebra (which unfortunately, I lack).

To take the last approach, axiomatizing our conclusion that $1 > 0$ into truth, would be akin to eating dessert before dinner.

The approach that I found most enlightening — accessible yet satisfyingly rigorous — was presented in my introductory analysis class at the University of Michigan by Professor Stephen DeBacker. We’ll start at a level of abstraction that’s readily understandable — yet sufficiently logically separated from our result — so we’ll still be able to see firsthand how our basic assumptions can be used to formalize the seemingly simple conclusion we’re going for. Furthermore, our basic assumptions will be the same assumptions used by specialists in the fields of modern algebra and real analysis — so I would say we’re justified in choosing this place as a starting point.

Our “minimal assumption” is that the real numbers satisfy the below properties, where $a$, $b$, and $c$ are arbitrary real numbers. The term commonly used by the mathematical community to refer to each property is listed in parentheses next to each one.

1. $a + b$ is a real number (i.e. adding two real numbers results in another real number, also known as “closure under addition”)
2. $a \times b$ is a real number (“closure under multiplication”)
3. $a + b = b + a$ (i.e. we can switch the order of addends, known as “commutativity of addition”)
4. $(a + b) + c = a + (b + c)$ (i.e. we can add in any order, known as “associativity of addition”)
5. There exists a real number $0$ such that $a + 0 = a$ ($0$ is an “additive identity element”)
6. There exists a real number $x$ such that $a + x = 0$ ($x$ is an “additive inverse element”)
7. $a \times b = b \times a$ (“commutativity of multiplication”)
8. $(a \times b) \times c = a \times (b \times c)$ (“associativity of multiplication”)
9. There exists a real number $1$ such that $a \times 1 = a$ ($1$ is a “multiplicative identity”)
10. There exists a real number $y$ such that $a \times y = 1$, when $a$ is not zero ($y$ is a “multiplicative inverse”)
11. $a \times (b + c) = a \times b + a \times c$ (“distributivity”)
12. $1 \neq 0$
13. The real numbers are separated into positive and negative subsets
14. Adding or multiplying two positive numbers (i.e. numbers greater than $0$) results in a positive number
15. Every real number $a$ is either positive ($a > 0$), negative ($a < 0$), or zero itself ($a = 0$)

For now, we can plug in a few values for $a$, $b$, and $c$ to get an intuition for why each of these properties holds. Again, there are ways to prove that the real numbers satisfy all of the above properties using tools of modern algebra, but without that background, what we have above is a very accessible starting point.
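For instance, here is what plugging in $a = 2$, $b = 3$, and $c = 4$ looks like for a handful of the properties. The particular values are arbitrary, and checking one example is of course a sanity check rather than a proof.

```latex
\begin{align*}
2 + 3 &= 5 = 3 + 2                              && \text{commutativity of addition} \\
(2 + 3) + 4 &= 9 = 2 + (3 + 4)                  && \text{associativity of addition} \\
2 + (-2) &= 0                                   && \text{additive inverse} \\
2 \times \tfrac{1}{2} &= 1                      && \text{multiplicative inverse} \\
2 \times (3 + 4) &= 14 = 2 \times 3 + 2 \times 4 && \text{distributivity}
\end{align*}
```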

Also, we won’t need to use all of the given properties above in our proof, but I’ve listed them all here because a (potentially infinite) collection of numbers that satisfy the first twelve properties has a special name among mathematicians — a “field.” If that collection of numbers also satisfies the last three properties, it’s called an “ordered field”. Essentially, our assumption is that the real numbers form an ordered field.

To begin our proof, we assume our axiom — that the real numbers form an ordered field, and consequently fulfill the fifteen properties above.

To start, by properties (5) and (9) above, we know that real numbers $0$ and $1$ exist. By property (15), we know that $1$ is either positive, negative, or zero. By property (12), we know that $1 \neq 0$. That leaves two possibilities: either $1$ is positive, and $1 > 0$; or $1$ is negative, and $1 < 0$.

We now proceed by a technique known as “proof by contradiction.” Essentially, we assume something we wish to show is untrue to be true, and **use the assumed truth to prove something that we know for sure is untrue**. The logical consequence of this kind of maneuvering is that **it must be impossible for the thing we assumed to be true to be indeed true**, because it led to an impossibility. Hence, it must be false.

If we have a few possibilities to choose from, one of which must be true, this tactic is a good way to eliminate the impossible choices and narrow down the scope of what the real possibility is.
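Schematically, the pattern can be sketched as follows, where $P$ stands for the statement we want to establish and $Q$ for whatever impossible statement we manage to derive. Both letters are placeholders introduced here for illustration.

```latex
\begin{align*}
&\text{Goal: show } P. \\
&\text{Step 1: assume } \neg P. \\
&\text{Step 2: from } \neg P \text{, derive a statement } Q \text{ that is known to be false.} \\
&\text{Step 3: conclude that } \neg P \text{ cannot hold, so } P \text{ must be true.}
\end{align*}
```

In our setting, trichotomy and $1 \neq 0$ leave only two cases, so ruling out $1 < 0$ is the same as establishing $1 > 0$.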

If proof by contradiction sounds complicated, it is — but it’s also an essential mathematical tool. Sometimes, the complexity of proving something directly — without contradiction — makes the problem difficult enough that it actually can be easier to show that the alternative possibilities simply can’t be true.

Let’s assume that $1 < 0$ — that $1$ is negative — and show that it leads to an impossibility. One potential impossibility that we could demonstrate is that this assumption implies that $1 \geq 0$, because by property (15), $1$ cannot be both less than zero and greater than or equal to zero at the same time.

By property (6), there exists a real number $x$ such that $1 + x = 0$.

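We can add $x$ to both sides of our assumption, $1 < 0$, to get $1 + x < 0 + x$. (This step uses the fact that adding the same number to both sides of an inequality preserves it, a standard property of ordered fields that the list above leaves implicit.)

Since $1 + x = 0$ by our choice of $x$, and properties (3) and (5) tell us that $0 + x = x$, we can simplify the inequality to $0 < x$.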
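Laid out line by line, the steps so far look like this. Note that the second line assumes that adding the same number to both sides preserves an inequality, which is a standard ordered-field fact rather than one of the fifteen listed properties.

```latex
\begin{align*}
1 &< 0          && \text{assumption, to be contradicted} \\
1 + x &< 0 + x  && \text{add } x \text{ to both sides (assumed to preserve } < \text{)} \\
0 &< 0 + x      && \text{since } 1 + x = 0 \text{ by the choice of } x \\
0 &< x          && \text{since } 0 + x = x + 0 = x \text{ by properties (3) and (5)}
\end{align*}
```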

We can’t say just yet that $x$ must be $-1$, though — property (6) only says that there *exists* a real number $x$. We need to prove it.

A lemma is an intermediate truth that we can use to demonstrate a proof of a larger result. Whether something is called a theorem or lemma isn’t necessarily well-defined, but in general lemmas “help” us to prove what we really want.

In our case, to prove that the $x$ in property (6) is unique — specifically, that there exists only *one* real number $x$ such that $1 + x = 0$ (and consequently, that real number $x$ must be $-1$), we can again proceed by contradiction.

Suppose that there exists another real number $z$, where $z \neq x$, such that $1 + z = 0$. Now, consider the expression $x + 1 + z$. Since equality is reflexive — that is, $a = a$ for all $a$ — we know that $x + 1 + z = x + 1 + z$.

By property (4), associativity of addition, we can group the terms as $(x + 1) + z = x + (1 + z)$.

By property (3), commutativity of addition, we can rearrange the first quantity to get $(1 + x) + z = x + (1 + z)$.

Since $1 + x$ and $1 + z$ both equal zero, we have $0 + z = x + 0$, and by property (5), the additive identity element (together with property (3) to rewrite $0 + z$ as $z + 0$), $z = x$. However, we assumed $z \neq x$, so we have a contradiction!

Thus, there can exist only *one* real number $x$ such that $1 + x = 0$. If we replace every instance of $1$ in the lines above with an arbitrary real number $a$, this lemma demonstrates that for any real number $a$, there exists a *unique* $x$ such that $a + x = 0$. Since this $x$ is unique, we can safely give this $x$ a unique name, $-a$, resulting in the familiar notion of negatives, where $a + (-a) = 0$. In our specific case, this shows that $x$ must equal $-1$.
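The same uniqueness argument can also be written as a single chain of equalities for a general $a$. This is a direct version (no contradiction needed), offered as a sketch: suppose $x$ and $z$ both satisfy $a + x = 0$ and $a + z = 0$.

```latex
\begin{align*}
x &= x + 0          && \text{property (5)} \\
  &= x + (a + z)    && \text{since } a + z = 0 \\
  &= (x + a) + z    && \text{property (4)} \\
  &= (a + x) + z    && \text{property (3)} \\
  &= 0 + z          && \text{since } a + x = 0 \\
  &= z + 0          && \text{property (3)} \\
  &= z              && \text{property (5)}
\end{align*}
```

So any two additive inverses of $a$ must coincide.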

Applying the results of the above lemma, our inequality from before, $0 < x$, becomes $0 < -1$.

By property (14), the product of positive numbers is positive, so $0 < (-1)(-1)$. We can’t say just yet that “two negatives cancel each other out,” though — none of the axioms imply that! We need to *prove* that $(-1)(-1) = (1)(1)$. We’ll need another lemma.

In the general case, for any real number $a$, we need to show that $(-a)(-a) = (a)(a) = a^2$. Property (6) — the assumption that every element has an additive inverse — deals with negative signs, and could provide an interesting avenue to show this.

If you feel like you’re getting the hang of things, feel free to stop here and try to use the axioms to prove some of the intermediate results on your own. If you get stuck, you can always scroll down!

Since additive inverses are unique, we know that there is a unique real number $-a^2$ such that $a^2 + (-a^2) = 0$.

By property (3), the commutativity of addition, we have $-a^2 + a^2 = 0$.

The previous lemma told us that if $-a^2 + x = 0$, then $x$ is unique, so if we have an expression of the form $-a^2 + x = 0$, we must have $x = a^2$. Thus, if we can show that $-a^2 + (-a)(-a) = 0$, we’ll know for sure that $(-a)(-a) = a^2$.

Let’s work with the expression $-a^2 + (-a)(-a)$. We need to somehow split $-a^2$ into its constituent terms to factor it, so we need yet another lemma: a proof that $-a^2 = (-a)(a)$.

For this lemma, we’ll take a similar approach to the one we started above, using the uniqueness of additive inverses to show that one product must equal another product. Since $-a^2$ is the unique additive inverse of $a^2$, if we show that $a^2 + (-a)(a) = 0$, then $(-a)(a) = -a^2$.

Note that $a^2 = a(a)$, so by property (7), the commutativity of multiplication, we have $a^2 + (-a)(a) = a(a) + a(-a)$.

By property (11), we can factor $a(a) + a(-a)$ into $a(a + (-a))$.

By property (6), $a + (-a) = 0$, so we have $a^2 + (-a)(a) = a0$.

We’d be done if $a0 = 0$, but we haven’t proved that yet!

By property (5), $0 + 0 = 0$. Thus, we can write $a0 = a(0 + 0)$.

By property (11), this distributes to $a0 = a0 + a0$.

By property (6), there exists a unique additive inverse $-a0$ of $a0$, so we can add it to both sides of our equation to get $a0 + (-a0) = a0 + a0 + (-a0)$.

Simplifying, we get $0 = a0$.
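Spelled out, the simplification goes as follows. Here $a \times 0$ is the product written as $a0$ above, and $-(a \times 0)$ denotes its additive inverse.

```latex
\begin{align*}
a \times 0 &= a \times (0 + 0)                          && \text{since } 0 + 0 = 0 \text{ by property (5)} \\
           &= a \times 0 + a \times 0                   && \text{property (11)} \\[6pt]
0 &= a \times 0 + (-(a \times 0))                       && \text{property (6)} \\
  &= (a \times 0 + a \times 0) + (-(a \times 0))        && \text{substituting the line above} \\
  &= a \times 0 + (a \times 0 + (-(a \times 0)))        && \text{property (4)} \\
  &= a \times 0 + 0                                     && \text{property (6)} \\
  &= a \times 0                                         && \text{property (5)}
\end{align*}
```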

With that, we can conclude that $a^2 + (-a)(a) = a0 = 0$, so $(-a)(a) = -a^2$.
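Collected into one chain, the lemma reads as follows. The final line leans on the uniqueness of additive inverses proved earlier.

```latex
\begin{align*}
a^2 + (-a)(a) &= a(a) + a(-a)     && \text{property (7)} \\
              &= a(a + (-a))      && \text{property (11)} \\
              &= a \times 0       && \text{property (6)} \\
              &= 0                && \text{lemma: } a \times 0 = 0 \\[6pt]
\text{hence } (-a)(a) &= -a^2     && \text{uniqueness of the additive inverse of } a^2
\end{align*}
```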

Bringing that into the previous lemma, we have $-a^2 + (-a)(-a) = (-a)(a) + (-a)(-a)$.

By property (11), we can then factor this expression into $-a^2 + (-a)(-a) = (-a)(a + (-a))$.

By property (6), putting the additive inverses together, we have $-a^2 + (-a)(-a) = (-a)0$. Since we showed that any real number times $0$ is $0$, this gives $-a^2 + (-a)(-a) = 0$.

Thus, $(-a)(-a)$ is the unique additive inverse of $-a^2$, and hence $(-a)(-a) = a^2$.
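As a compact summary of this lemma, here is the whole computation in one place, with $-a^2$ read as the additive inverse of $a^2$.

```latex
\begin{align*}
-a^2 + (-a)(-a) &= (-a)(a) + (-a)(-a)   && \text{lemma: } (-a)(a) = -a^2 \\
                &= (-a)(a + (-a))       && \text{property (11)} \\
                &= (-a) \times 0        && \text{property (6)} \\
                &= 0                    && \text{lemma: any real number times } 0 \text{ is } 0 \\[6pt]
-a^2 + a^2 &= 0                         && \text{properties (3) and (6)} \\
\text{hence } (-a)(-a) &= a^2           && \text{uniqueness of the additive inverse of } -a^2
\end{align*}
```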

Unwrapping all the way to the top, we left off at $0 < (-1)(-1)$. This last lemma tells us that $(-1)(-1) = (1)(1)$. By property (9), the multiplicative identity element, $(1)(1) = 1$. Thus, we have $0 < 1$, so $1 > 0$.

This is a contradiction, because we assumed that $1 < 0$! By property (15), every real number is either positive, negative, or zero — no number can be both positive and negative at the same time! Thus, we have an impossibility, and our original assumption — $1 < 0$ — cannot hold. We can eliminate that possibility, leaving only one remaining case: $1 > 0$. Since we know that every real number must fall into one of the three cases, and we’ve eliminated two of them, we must have $1 > 0$.
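For reference, the whole argument compresses into a short chain of implications. This is only a summary sketch; each arrow stands for one of the steps or lemmas worked out above.

```latex
\begin{align*}
1 < 0 &\implies 0 < -1          && \text{add } -1 \text{ to both sides} \\
      &\implies 0 < (-1)(-1)    && \text{property (14)} \\
      &\implies 0 < (1)(1)      && \text{lemma: } (-a)(-a) = a^2 \\
      &\implies 0 < 1           && \text{property (9)}
\end{align*}
```

The last line contradicts the first by property (15), which leaves $1 > 0$ as the only possibility.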

As Peter Thiel so nicely put it, how fresh and strange.
