January 28, 2013

To Infinity… And Beyond!

Exploring the outer limits

“It is known that there are an infinite number of worlds, simply because there is an infinite amount of space for them to be in. However, not every one of them is inhabited. Therefore, there must be a finite number of inhabited worlds.

Any finite number divided by infinity is as near to nothing as makes no odds, so the average population of all the planets in the universe can be said to be zero. From this it follows that the population of the whole universe is also zero, and that any people you may meet from time to time are merely the products of a deranged imagination.”
– The Restaurant at the End of the Universe, Douglas Adams

If there's one thing mathematicians have a love-hate relationship with, it has to be infinity. It's the ultimate tease: it beckons us to come closer, but never allows us anywhere near it. No matter how far we travel to impress it, infinity remains disinterested, equally distant from everything: infinitely far!

$$ 0 < 1 < 2 < 3 < … < \infty $$

Yet infinity is not just desirable, it is absolutely necessary. All over mathematics, we find problems that no finite number of steps can resolve. Without infinity, we wouldn't have real numbers, for starters. That's a problem: our circles aren't round anymore (no $ π $ or $ \tau $) and our exponentials stop growing right (no $ e $). We can throw out all of our triangles too: most of their sides have exploded.

A steel railroad bridge with a 1200 ton counter-weight.
Completed in 1910. Source: Library of Congress.

We like infinity because it helps avoid all that. In fact even when things are not infinite, we often prefer to pretend they are—we do geometry in infinitely big planes, because then we don't have to care about where the edges are.

Now, suppose we want to analyze a steel beam, because we're trying to figure out if our proposed bridge will stay up. If we want to model reality accurately, that means simulating each individual particle, every atom in the beam. Each has its own place and pushes and pulls on others nearby.

But even just $ 40 $ grams of pure iron contains $ 4.31 \cdot 10^{23} $ atoms. That's an inordinate amount of things to keep track of for just 1 teaspoon of iron.
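As a quick sanity check on that figure, here's a minimal sketch of the arithmetic. The molar mass of iron (about 55.845 g/mol) and Avogadro's constant are values I'm bringing in from outside; neither is stated above, and the function name is just for illustration.

```python
# Rough count of atoms in a lump of pure iron.
# Assumed constants (not taken from the article itself):
AVOGADRO = 6.02214076e23   # atoms per mole
MOLAR_MASS_IRON = 55.845   # grams per mole

def atoms_in_iron(grams: float) -> float:
    """Number of atoms in `grams` of pure iron."""
    moles = grams / MOLAR_MASS_IRON
    return moles * AVOGADRO

print(f"{atoms_in_iron(40):.2e}")  # prints 4.31e+23
```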

Instead, we pretend the steel is solid throughout. Rather than being composed of atoms with gaps in between, it's made of some unknown, filled in material with a certain density, expressed e.g. as grams per cubic centimetre. Given any shape, we can determine its volume, and hence its total mass, and go from there. That's much simpler than counting and keeping track of individual atoms, right?

Unfortunately, that's not quite true.

The Shortest Disappearing Trick Ever

Like all choices in mathematics, this one has consequences we cannot avoid. Our beam's density is mass per volume. Individual points in space have zero volume. That would mean that at any given point inside the beam, the amount of mass there is $ 0 $. How can a beam that is entirely composed of nothing be solid and have a non-zero mass?

Bam! No more iron anywhere.

While Douglas Adams was being deliberately obtuse, there's a kernel of truth there, which is a genuine paradox: what exactly is the mass of every atom in our situation?

To make our beam solid and continuous, we had to shrink every atom down to an infinitely small point. To compensate, we had to create infinitely many of them. Dividing the finite mass of the beam between an infinite amount of atoms should result in $ 0 $ mass per atom. Yet all these masses still have to add up to the total mass of the beam. This suggests $ 0 + 0 + 0 + … > 0 $, which seems impossible.

If the mass of every atom were not $ 0 $, and we have infinitely many points inside the beam, then the total mass is infinity times the atomic mass $ m $. Yet the total mass is finite. This suggests $ m + m + m + … < \infty $, which also doesn't seem right.

It seems whatever this number $ m $ is, it can't be $ 0 $ and can't be non-zero. It's definitely not infinite, since we only had a finite mass to begin with. It's starting to sound like we'll have to invent a whole new set of numbers again to even find it.

That's effectively what Isaac Newton and Gottfried Leibniz set in motion at the end of the 17th century, when they both discovered calculus independently. It was without a doubt the most important discovery in mathematics and resulted in formal solutions to many problems that were previously unsolvable. Our entire understanding of physics has relied on it since. Yet it took until the 19th century for the works of Augustin Cauchy and Karl Weierstrass to pop up, which formalized the required theory of convergence. This allows us to describe exactly how differences can shrink down to nothing as you approach infinity. Even that wasn't enough: it was only in the 1960s that the idea of infinitesimals as fully functioning numbers (the hyperreal numbers) was finally proven to be consistent by Abraham Robinson.

But it goes back much further. Ancient mathematicians were aware of problems of infinity, and used many ingenious ways to approach it. For example, $ π $ was found by considering circles to be infinite-sided polygons. Archimedes' work is likely the earliest use of indivisibles, using them to imagine tiny mechanical levers and find a shape's center of mass. He's better known for running naked through the streets shouting Eureka! though.

That it took so long shows that this is not an easy problem. The proofs involved are elaborate and meticulous, all the way back. They have to be, in order to nail down something as tricky as infinity. As a result, students generally learn calculus through the simplified methods of Newton and Leibniz, rather than the most mathematically correct interpretation. We're taught to mix notations from 4 different centuries together, and everyone's just supposed to connect the dots on their own. Except the trail of important questions along the way is now overgrown with jungle.

Still, it shows that even if we don't understand the whole picture, we can get a lot done. This article is in no way a formal introduction to infinitesimals. Rather, it's a demonstration of why we might need them.

What is happening when we shrink atoms down to points? Why does it make shapes solid yet seemingly hollow? Is it ever meaningful to write $ x = \infty $? Is there only one infinity, or are there many different kinds?

To answer that, we first have to go back to even simpler times, to Ancient Greece, and start with the works of Zeno.

Achilles and the Tortoise

Zeno of Elea was one of the first mathematicians to pose these sorts of questions, effectively trolling mathematics for the next two millennia. He lived in the 5th century BC in southern Italy, although only second-hand references survive. In his series of paradoxes, he examines the nature of equality, distance, continuity, of time itself.

Because it's the ancient times, our mathematical knowledge is limited. We know about zero, but we're still struggling with the idea of nothing. We've run into negative numbers, but they're clearly absurd and imaginary, unlike the positive numbers we find in geometry. We also know about fractions and ratios, but square roots still confuse us, even though our temples stay up.

So the story goes: the tortoise challenges Achilles to a footrace.

"If you give me a head start," it says, "any start at all, you can never win."
Achilles laughs and decides to be a good sport: he'll only run twice as fast as the tortoise.

The tortoise explains: "If you want to pass me, first you have to move to where I am. By the time you get there, I'll have walked ahead a little bit."

"While you cross the next distance, I will move yet again. No matter how many times you try to catch up, I'll always be some small distance ahead. Therefore, you cannot beat me."

Achilles realizes that talking tortoises are not a sign of positive mental health, so he decides to find a wall to run into instead. It will either confirm the theory, or end the pain.

See, the race is actually unnecessary, because the problem remains the same.
In order to reach the wall, Achilles first has to cross half the way there.

Then he has to go half that distance again, and again. No matter how many times he repeats this, there will always be some distance left. So if Achilles can't cross this distance in a finite amount of steps, why is he wearing that stupid helmet?

$$ … $$

The distance travelled forms a never ending sequence of expanding sums.
We have to examine the entire sequence, rather than individual numbers in it.

By definition, the distance travelled and distance to the wall always add up to $ 1 $. So one simple way to resolve this conundrum is to say: Well yes, it's going to take you infinitely long to glue all those pieces together, but only because you already spent an infinite amount of time chopping them up!
But that's not a very mathematically satisfying answer. Let's try something else.

The distance to the wall is always equal to the last step taken. We know that each step is half as long as the previous one, starting with $ \frac{1}{2} $. Therefore, the distance to the wall must decrease exponentially: $ \frac{1}{2}, \frac{1}{4}, \frac{1}{8}, \frac{1}{16}, … $, getting closer to zero with every step.

But why can we say that this gap effectively closes to zero after 'infinity steps'? The number that we're building up is $ \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \frac{1}{16} + … \,$

We know our sum will never exceed $ 1 $, as there is only $ 1 $ unit of distance being divided. This means $ \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \frac{1}{16} + … \leq 1 $, which eliminates every number past the surface of the wall—but not the surface itself.

Suppose we presume $ \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \frac{1}{16} + … < 1 $ and hence that this number lies some tiny distance in front of the wall.

Well in that case, all we need to do is zoom in far enough, and we'll see our sequence jump past it after a certain finite number of steps.

If we try to move it closer to the wall, the same thing happens. This number simply cannot be less than $ 1 $. Therefore $ \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \frac{1}{16} + … \geq 1 $.

The only place $ \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \frac{1}{16} + … \, $ can be is exactly $ 0 $ units away from $ 1 $. If two numbers have zero distance between them, then they are equal.
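The "zoom in far enough" step can be played out directly. Here's a minimal sketch, using exact fractions, that takes any guess below $ 1 $ and counts how many half-steps it takes for the running sum to jump past it. The name `steps_to_pass` is made up for this example.

```python
from fractions import Fraction

def steps_to_pass(guess: Fraction) -> int:
    """Smallest n for which 1/2 + 1/4 + ... + 1/2^n exceeds `guess`.

    Only works for guesses below 1, which is the whole point.
    """
    if guess >= 1:
        raise ValueError("no partial sum ever exceeds 1")
    total, step, n = Fraction(0), Fraction(1, 2), 0
    while total <= guess:
        total += step   # take the next half-step towards the wall
        step /= 2
        n += 1
    return n

print(steps_to_pass(Fraction(999, 1000)))  # prints 10
```

However close to the wall we place the guess, the answer stays finite. Only a guess of exactly $ 1 $ is never passed.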

$$ … $$

What we've actually done here is applied the principle of limits: we've defined a procedure of steps that lets us narrow down the interval where the infinite sum might be. The lower bound is the sequence of sums itself: it only increases towards $ 1 $, never decreases. For the upper bound, we established no sum could exceed $ 1 $. Therefore the interval must shrink to nothing, and the sequence converges.

$$ \lim_{n \to +\infty} x_n = \mathop{\class{no-outline}{►\hspace{-2pt}►}}_{\infty\hspace{2pt}} x_n $$

The purpose of a limit is then to act as a supercharged fast-forward button. It lets us avoid the infinite amount of work required to complete sums like $ \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \frac{1}{16} + … $ and simply skip to the end. To do so, we have to step back, spot the pattern, and pin down where it ends. So limits allow us to literally reach the unreachable. But in fact, you already knew that.

$$ \frac{2}{3} = 0.66666… $$

$$ 0.6 + 0.06 + 0.006 + …\hspace{2pt} $$

As soon as you learned to divide, you found $ 2 \div 3 = 0.666… = 0.6 + 0.06 + 0.006 + …\hspace{2pt} $
Even in primary school the opportunity to examine infinity is there. Rather than tackle it head on, it's simply noted and filed. Eight years later it's regurgitated in the form of cryptic epsilon-delta definitions.
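The same squeeze works here. As a sketch, writing $ s_n $ (my own notation, not used elsewhere in this article) for the sum of the first $ n $ terms:

$$ s_n = 0.6 + 0.06 + … + \frac{6}{10^n} = \frac{2}{3} \left( 1 - \frac{1}{10^n} \right) $$

The remaining gap $ \frac{2}{3} - s_n = \frac{2}{3 \cdot 10^n} $ shrinks by a factor of ten with every digit, so any number below $ \frac{2}{3} $ gets passed after a finite number of steps.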

$$ 1 - 1 + 1 - 1 + 1 … $$

But then there are those pesky consequences again. By allowing the idea of infinity, we can invent an entire zoo of paradoxical things. For example, imagine a lamp that's switched on ($1$) and off ($0$) at intervals that decrease by a factor of two: on for $ \frac{1}{2}\,s $, off for $ \frac{1}{4}\,s $, on for $ \frac{1}{8}\,s $, off for $ \frac{1}{16}\,s $, …
After $ 1\,s $, when the switch has been flipped an infinite number of times, is the lamp on or off?

$$ (1 - 1) + (1 - 1) + (1 - 1) + … = 0 \,? $$

$$ 1 + (-1 + 1) + (-1 + 1) + … = 1 \,? $$

Another way to put this is that the lamp's state at $ 1\,s $ is the result of the infinite sum $ 1 - 1 + 1 - 1 + … $
Intuitively we might say each pair of $ +1 $ and $ -1 $ should cancel out and make the entire sum equal to $ 0 $.
But we can pair them the other way, leading to $ 1 $ instead. It can't be both.

If we zoom in, it's obvious that no matter how close we get to $ 1\,s $, the lamp's state keeps switching. Therefore it's meaningless to attempt to 'fast forward' to the end, and the limit does not exist. At $ 1\,s $ the lamp is neither on nor off: it's undefined. This infinite sum does not converge.
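To see the flip-flopping for yourself, here's a small sketch that tracks the running total of $ 1 - 1 + 1 - 1 + … $ after each flip. The function name `lamp_states` is my own invention.

```python
from itertools import accumulate, cycle, islice

def lamp_states(flips: int) -> list[int]:
    """Partial sums of 1 - 1 + 1 - 1 + ..., i.e. the lamp after each flip."""
    terms = islice(cycle([1, -1]), flips)
    return list(accumulate(terms))

print(lamp_states(8))  # prints [1, 0, 1, 0, 1, 0, 1, 0]
```

No matter how many flips we ask for, the last two entries are always a $ 1 $ and a $ 0 $, so there is nothing for the sums to settle on.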

But actually, we overcomplicated things. Thanks to the power of limits, we can ask a simpler, equivalent question. Given a lamp that switches on and off every second, what is its state at infinity? The answer's the same: it never settles.

Limits are the first tool in our belt for tackling infinity. Given a sequence described by countable steps, we can attempt to extend it not just to the end of the world, but literally forever. If this works we end up with a finite value. If not, the limit is undefined. A limit can be equal to $ \infty $, but that's just shorthand for saying the sequence eventually climbs past every finite bound and stays there. Negative infinity means the same thing in the downward direction.

Breaking Away From Rationality

Until now we've only encountered fractions, that is, rational numbers. Each of our sums was made of fractions. The limit, if it existed, was also a rational number. We don't know whether this was just a coincidence.

It might seem implausible that a sequence of numbers that is 100% rational and converges, can approach a limit that isn't rational at all. Yet we've already seen similar discrepancies. In our first sequence, every partial sum was less than $ 1 $. Meanwhile the limit of the sum was equal to $ 1 $. Clearly, the limit does not have to share all the properties of its originating sequence.

We also haven't solved our original problem: we've only chopped things up into infinitely many finite pieces. How do we get to infinitely small pieces? To answer that, we need to go looking for continuity.

Generally, continuity is defined by what it is and what its properties are: a noticeable lack of holes, and no paradoxical values. But that's putting the cart before the horse. First, we have to show which holes we're trying to plug.

Let's imagine the rational numbers.

Actually, hold on. Is this really a line? The integers certainly weren't connected.

Rather than assume anything, we're going to attempt to visualize all the rational numbers. We'll start with the numbers between $ 0 $ and $ 1 $.

$$ \class{blue}{\frac{0 + 1}{2}} $$

Between any two numbers, we can find a new number in between: their average. This leads to $ \frac{1}{2} $.

$$ \frac{a + b}{2} $$

By repeatedly taking averages, we keep finding new numbers, filling up the interval.

If we separate out every step, we get a binary tree.

You can think of this as a map of all the fractions of $ 2^n $. Given any such fraction, say $ \frac{13}{32} = \frac{13}{2^5} $, there is a unique path of lefts and rights that leads directly to it. At least, as long as it lies between $ 0 $ and $ 1 $.
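Finding that path is a matter of repeatedly asking "left or right of the middle?". A minimal sketch, assuming the tree starts at $ \frac{1}{2} $ and each fork leads to the midpoint of the left or right half. The `max_steps` cut-off is my own addition so the loop gives up on fractions that aren't of the form $ \frac{k}{2^n} $.

```python
from fractions import Fraction

def path_to(target: Fraction, max_steps: int = 64) -> str:
    """Lefts (L) and rights (R) leading from 1/2 to a fraction k/2^n in (0, 1)."""
    lo, hi = Fraction(0), Fraction(1)
    mid = (lo + hi) / 2
    path = ""
    while mid != target:
        if len(path) >= max_steps:
            raise ValueError("not reachable in a finite number of halvings")
        if target < mid:
            hi, path = mid, path + "L"   # continue in the left half
        else:
            lo, path = mid, path + "R"   # continue in the right half
        mid = (lo + hi) / 2
    return path

print(path_to(Fraction(13, 32)))  # prints LRRL
```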

Note that the graph resembles a fractal and that the distance to the top edge is divided in half with every step. But we only ever explore a finite amount of steps. Therefore, we are not taking a limit and we'll never actually touch the edge.

$$ \frac{2 \cdot a + b}{3} $$

$$ \frac{a + 2 \cdot b}{3} $$

But we can take thirds as well, leading to fractions with a power of $ 3^n $ in their denominator.

As some numbers can be reached in multiple ways, we can eliminate some lines, and end up with this graph, where every number sprouts into a three-way, ternary tree. Again, we have a map that gives us a unique path to any fraction of $ 3^n $ in this range, like $ \frac{11}{27} = \frac{11}{3^3} $.

$$ \frac{21}{60} = \frac{21}{2^2 \cdot 3 \cdot 5} $$

Because we can do this for any denominator, we can define a way to get to any rational number in a finite amount of steps. Take for example $ \frac{21}{60} $. We decompose its denominator into prime numbers and begin with $ 0 $ and $ 1 $ again.

There is a division of $ 2^2 $, so we do two binary splits. This time, I'm repeating the previously found numbers so you can see the regular divisions more clearly. We get quarters.

The next factor is $ 3 $ so we divide into thirds once. We now have twelfths.

For the last division we chop into fifths and get sixtieths.

$ \frac{21}{60} $ is now the 21st number from the left.
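The recipe is mechanical enough to sketch in a few lines: factor the denominator, then apply one split per prime factor and keep track of how many equal parts we have. Both function names are hypothetical helpers for this example.

```python
def prime_factors(n: int) -> list[int]:
    """Prime factors of n in ascending order, with repetition."""
    factors, p = [], 2
    while n > 1:
        while n % p == 0:
            factors.append(p)
            n //= p
        p += 1
    return factors

def division_steps(denominator: int) -> list[int]:
    """Number of equal parts the range 0..1 is cut into after each split."""
    parts, steps = 1, []
    for p in prime_factors(denominator):
        parts *= p
        steps.append(parts)
    return steps

print(prime_factors(60))   # prints [2, 2, 3, 5]
print(division_steps(60))  # prints [2, 4, 12, 60]
```

The last entry is always the denominator itself, so the numerator simply counts tick marks from $ 0 $.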

But this means we've found a clear way to visualize all the rational numbers between $ 0 $ and $ 1 $: it's all the numbers we can reach by applying a finite number of binary (2), ternary (3), quinary (5) etc. divisions, for any denominator. So there's always a finite gap between any two rational numbers, even though there are infinitely many of them.

The rational numbers are not continuous. Therefore, it is more accurate to picture them as a set of tick marks than a connected number line.

To find continuity then, we need to revisit one of our earlier trees. We'll pick the binary one.
While every fork goes two ways, we actually have a third choice at every step: we can choose to stop. That's how we get a finite path to a whole fraction of $ 2^n $.

But what if we never stop? We have to apply a limit: we try to spot a pattern and try to fast-forward it. Note that by halving each step vertically on the graph, we've actually linearized each approach into a straight line which ends. Now we can take limits visually just by intersecting lines with the top edge.

Right away we can spot two convergent limits: by always choosing either the left or the right branch, we end up at respectively $ 0 $ and $ 1 $.

These two sequences both converge to $ \frac{1}{2} $. It seems that 'at infinity steps', the graph meets up with itself in the middle.

But the graph is now a true fractal. So the same convergence can be found here. In fact, the graph meets up with itself anywhere there is a multiple of $ \frac{1}{2^n} $.

That's pretty neat: now we can eliminate the option of stopping altogether. Instead of ending at $ \frac{5}{16} $, we can simply take one additional step in either direction, followed by infinitely many opposite steps. Now we're only considering paths that are infinitely long.

But if this graph only leads to fractions of $ 2^n $, then there must be gaps between them. In the limit, the distance between any two adjacent numbers in the graph shrinks down to exactly $ 0 $, which suggests there are no gaps. This infinite version of the binary tree must lead to a lot more numbers than we might think.
Suppose we take a path of alternating left and right steps, and extend it forever. Where do we end up?

We can apply the same principle of an upper and lower bound, but now we're approaching from both sides at once. Thanks to our linearization trick, the entire sequence fits snugly inside a triangle.

If we zoom into the convergence at infinity, we actually end up at $ \class{orangered}{\frac{2}{3}} $.
Somehow we've managed to coax a fraction of $ 3 $ out of a perfectly regular binary tree.

If we alternate two lefts with one right, we can end up at $ \class{orangered}{\frac{4}{7}} $. This is remarkable: when we tried to visualize all the rational numbers by combining all kinds of divisions, we were overthinking it. We only needed to take binary divisions and repeat them infinitely with a limit.

Every single rational number can then be found by taking a finite amount of steps to get to a certain point, and then settling into a repeating pattern of lefts and/or rights all the way to infinity.
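We can check where such a path ends up without walking it forever. Here's a sketch, assuming every step picks the left or right half of the current range, starting from $ 0 $ to $ 1 $. The finite part of the path is called `head` and the repeating part `tail`; both names are mine.

```python
from fractions import Fraction

def path_limit(head: str, tail: str) -> Fraction:
    """Where a path ends up: `head` once, then `tail` repeated forever.

    Steps are written L (left) and R (right), starting from the range 0..1.
    """
    def bits(steps: str) -> int:
        # Read the steps as a binary integer: L is 0, R is 1.
        return int("0" + steps.replace("L", "0").replace("R", "1"), 2)

    p, r = len(head), len(tail)
    if r == 0:
        raise ValueError("tail must contain at least one step")
    # The head contributes a plain binary fraction; the repeating tail is a
    # geometric series that sums to bits(tail) / (2^r - 1), shifted p places.
    return Fraction(bits(head), 2**p) + Fraction(bits(tail), (2**r - 1) * 2**p)

print(path_limit("", "RL"))   # prints 2/3
print(path_limit("", "RLL"))  # prints 4/7
```

It also confirms that the two infinite detours around a fraction like $ \frac{5}{16} $ agree: `path_limit("LRLR", "L")` and `path_limit("LRLL", "R")` give the same value.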

If we can find numbers between $ 0 $ and $ 1 $ this way, we can apply the exact same principle to the range $ 1 $ to $ 2 $. So we can connect two of these graphs into a single graph with its tip at $ 1 $.

But we can repeat it as much as we like. The full graph is not just infinitely divided, but infinitely big, in that no finite box can contain it. That means it leads to every single positive rational number. We can start anywhere we like. Is your mind blown yet?

No? Ok. But if this works for positives, we can build a similar graph for the negatives just by mirroring it. So we now have a map of the entire rational number set. All we need to do is take infinite paths that settle into a repeating pattern from either a positive or a negative starting point. When we do, we find every such path leads to a rational number.
So any rational number can be found by taking an infinite stroll on one of two infinite binary trees.

Wait, did I say two infinite trees? Sorry, I meant one infinitely big tree.
See, if we repeatedly scale up a fractal binary tree and apply a limit to that, we end up with almost exactly the same thing. Only this time, the two downward diagonals always eventually fold back towards $ 0 $. This creates a path of infinity + 1 steps downward. While that might not be very practical, it suggests you can ride out to the restaurant at the end of the universe, have dinner, and take a single step to get back home.

Is it math, or visual poetry? It's time to bring this fellatio of the mind to its inevitable climax.

You may wonder, if this map is so amazing, how did we ever do without?
Let's label our branches. If we go left, we call it $ 0 $. If we go right, we call it $ 1 $.

$$ \frac{5}{3} = \class{green}{11}\class{blue}{0}\hspace{2pt}\class{green}{1}\class{blue}{0}\hspace{2pt}\class{green}{1}\class{blue}{0}… $$

We can then identify any number by writing out the infinite path that leads there as a sequence of ones and zeroes—bits.

But you already knew that.

$$ \frac{5}{3} = \class{green}{1}.\class{green}{1}\class{blue}{0}\hspace{2pt}\class{green}{1}\class{blue}{0}\hspace{2pt}\class{green}{1}\class{blue}{0}…_2 $$

See, we've just rediscovered the binary number system. We're so used to numbers in decimal, base 10, we didn't notice. Yet we all learned that rational numbers consist of digits that settle into a repeating sequence, a repeating pattern of turns. Disallowing finite paths works the same, even in decimal: the number $ 0.95 $ can be written as $\, 0.94999…\, $, i.e. take one final step in one direction, followed by infinitely many steps the other way.
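Going the other way, from a fraction to its turns, is just long division in base 2. A minimal sketch for non-negative fractions: it stops as soon as a remainder comes back, because from there on the digits must repeat. The parentheses around the repeating part are my own notation.

```python
from fractions import Fraction

def binary_expansion(x: Fraction) -> str:
    """Binary digits of a non-negative fraction, repeating part in parentheses."""
    whole, rem = divmod(x.numerator, x.denominator)
    digits, seen = [], {}
    while rem not in seen:
        seen[rem] = len(digits)   # remember where this remainder first appeared
        rem *= 2
        digits.append(str(rem // x.denominator))
        rem %= x.denominator
    start = seen[rem]             # the digits repeat from this position onward
    head, tail = "".join(digits[:start]), "".join(digits[start:])
    return f"{whole:b}.{head}({tail})"

print(binary_expansion(Fraction(5, 3)))  # prints 1.(10)
print(binary_expansion(Fraction(4, 5)))  # prints 0.(1100)
```

A fraction that "stops", like $ \frac{1}{2} $, comes out as `0.1(0)`: a finite path followed by lefts forever.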

$$ \frac{4}{5} = \class{blue}{0}.\class{green}{11}\class{blue}{00}\hspace{2pt}\class{green}{11}\class{blue}{00}…_2 $$

When we write down a number digit by digit, we're really following the path to it in a graph like this, dialing the number's … er … number. The rationals aren't shaped like a binary tree, rather, they look like a binary tree when viewed through the lens of binary division. Every infinite binary, ternary, quinary, etc. tree is then a different but complete perspective of the same underlying thing. We don't have the map, we have one of infinitely many maps.

$$ π = \class{green}{11}.\class{blue}{00}\class{green}{1}\class{blue}{00}\class{green}{1}\class{blue}{0000}\class{green}{1}…_2 $$

Which means we can show this graph is actually an interdimensional number portal.
See, we already know where the missing numbers are. Irrational numbers like $ π $ form a never-repeating sequence of digits. If we want to reach $ π $, we find it's at the end of an infinite path whose turns do not repeat. By allowing such paths, our map leads us straight to them. Even though it's made out of only one kind of rational number: division by two.

$$ π = \mathop{\class{no-outline}{►\hspace{-2pt}►}}_{\infty\hspace{2pt}} x_n \,? $$

So now we've invented real numbers. How do we visualize this invention? And where does continuity come in? What we need is a procedure that generates such a non-repeating path when taken to the limit. Then we can figure out where the behavior at infinity comes from.

Because the path never settles into a pattern, we can't pin it down with a single neat triangle like before. We try something else. At every step, we can see that the smallest number we can still reach is found by always going left. Similarly, the largest available number is found by always going right. Wherever we go from here, it will be somewhere in this range.

We can set up shrinking intervals by placing such triangles along the path, forming a nested sequence.

$$ \begin{align} 3 \leq & π \leq 4 \\ 3.1 \leq & π \leq 3.2 \\ 3.14 \leq & π \leq 3.15 \\ 3.141 \leq & π \leq 3.142 \\ 3.1415 \leq & π \leq 3.1416 \\ 3.14159 \leq & π \leq 3.14160 \\ \end{align} $$

$$ \begin{align} 11_2 \leq & π \leq 100_2 \\ 11.0_2 \leq & π \leq 11.1_2 \\ 11.00_2 \leq & π \leq 11.01_2 \\ 11.001_2 \leq & π \leq 11.010_2 \\ 11.0010_2 \leq & π \leq 11.0011_2 \\ 11.00100_2 \leq & π \leq 11.00101_2 \\ \end{align} $$

What we've actually done is rounded up and down at every step, to find an upper and lower bound with a certain amount of digits. This works in any number base.
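Here's that rounding as a small sketch that works in any base. One loud caveat: it leans on Python's `math.pi`, which is only a double-precision approximation, so it's good for roughly fifteen decimal digits and no more. The function name `bounds` is made up.

```python
import math
from fractions import Fraction

def bounds(x: float, base: int, digits: int) -> tuple[Fraction, Fraction]:
    """Round x down and up to `digits` places after the point in `base`."""
    scale = base ** digits
    lower = Fraction(math.floor(Fraction(x) * scale), scale)
    return lower, lower + Fraction(1, scale)

for k in range(4):
    lo, hi = bounds(math.pi, 10, k)
    print(float(lo), "<= pi <=", float(hi))
```

Calling `bounds(math.pi, 2, 5)` gives $ \frac{100}{32} $ and $ \frac{101}{32} $, which are $ 11.00100_2 $ and $ 11.00101_2 $.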

Let's examine these intervals by themselves. We can see that due to the binary nature, each interval covers either the left or right side of its ancestor. Because our graph goes on forever, there are infinitely many nested intervals. This tower of $ π $ never ends and never repeats itself; we just squeezed it into a finite space so we could see it better.

If we instead approach a rational number like $ \frac{10}{3} = 3.333…\, $ then the tower starts repeating itself at some point. Note that the intervals don't slide smoothly. Each can only be in one of two places relative to its ancestor.

In order to reach a different rational number, like $ 3.999… = 4 $, we have to establish a different repeating pattern. So we have to rearrange infinitely many levels of the tower all at once, from one configuration to another. This reinforces the notion that rational numbers are not continuous.

If the tower converges to a number, then the top must be infinitely thin, i.e. $ 0 $ units wide. That would suggest it's meaningless to say what the interval at infinity looks like, because it stops existing. Let's try it anyway.

There is only one question to answer: does the interval cover the left side, or the right?

Oddly enough, in this specific case of $ 3.999…\, $ there is an answer. The tower leans to the right. Therefore, the state of the interval is the same all the way up. If we take the limit, it converges and the final interval goes right.

But we can immediately see that we can build a second tower that leans left, which converges on the same number. We could distinguish between the two by writing it as $ 4.000…\, $ In this case the final interval goes left.

If we approach $ 10/3 $, we take a path of alternating left and right steps. The state of the interval at infinity becomes like our paradoxical lamp from before: it has to be both left and right, and therefore it is neither. It's simply undefined.

The same applies to irrational numbers like $ π $. Because the sequence of turns never repeats itself, the interval flips arbitrarily between left and right forever. Therefore it is in an undefined state at the end.

But there's another way to look at this.
If the interval converges to the number $ π $, then the two sequences of respectively lower and upper bounds also converge to $ π $ individually.

Remember how we derived our bounds: we rounded down by always taking lefts and rounded up by always taking rights. The shape of the tower depends on the specific path you're taking, not just the number you reach at the end.

That means we can think of all the lower bounds as ending in $ 0000… \, $ Their towers always lean left.

If we then take the limit of their final intervals as we approach $ π $, that goes left too. Note that this is a double limit: first we find the limit of the intervals of each tower individually, then we take the limit over all the towers as we approach $ π $.

For the same reason, we can think of all the upper bounds as ending in $ 1111 …\, $ Their towers always lean right. When we take the limit of their final intervals and approach $ π $, we find it points right.

But we could actually just reverse the rounding for the upper and lower bounds, and end up with the exact opposite situation. Therefore it doesn't mean that we've invented a red $ π $ to the left and green $ π $ to the right which are somehow different. $ π $ is $ π $. This only says something about our procedure of building towers. It matters because the towers are how we're trying to reach a real number in the first place.

See, our tower still represents a binary number of infinitely many bits. Every interval can still only be in one of two places. To run along the real number line, we'd have to rearrange infinitely many levels of the tower all at once to create motion. That still does not seem continuous.

We can resolve this if we picture the final interval of each tower as a bit at infinity. If we flip the bit at infinity, we swap between two equivalent ways of reaching a number, so this has no effect on the resulting number.

In doing so, we're actually imagining that every real number is a rational number whose non-repeating head has grown infinitely big. Its repeating tail has been pushed out all the way past infinity. That means we can flip the repeating part of our tower between different configurations without creating any changes in the number it leads to.

That helps a little bit with the intuition: if the tower keeps working all the way up there, it must be continuous at its actual tip, wherever that really is. A continuum is then what happens when the smallest possible step you can take isn't just as small as you want. It's so small that it no longer makes any noticeable difference. While that's not a very mathematical definition, I find it very helpful in trying to imagine how this might work.

Finally, we might wonder how many of each type of number there are.
The natural numbers are countably infinite: there is a procedure of steps which, in the limit, counts all of them. Just start at the beginning, and fast-forward.

$$ 1, 2, 3, 4, 5, 6, … $$

$$ \class{orangered}{2, 4, 6, 8, 10, 12, …} $$

$$ \class{green}{0, 1, -1, 2, -2, 3, …} $$

We can find a similar sequence for the even natural numbers by multiplying each number by two. We can also alternate between a positive and negative sequence to count the integers. We can match up the elements one-to-one, which means all three sequences are equally long. They're all countably infinite.
There are as many even positives as positives. Which is exactly as many as all the integers combined. As counter-intuitive as it is, it is the only consistent answer.
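Both counting procedures fit in a few lines. This is only a sketch with generator names of my own choosing; each one never ends, so we peek at the first six values.

```python
from itertools import count, islice

def evens():
    """2, 4, 6, ...: the n-th even number is simply 2n."""
    return (2 * n for n in count(1))

def integers():
    """0, 1, -1, 2, -2, ...: alternate positives and negatives."""
    yield 0
    for n in count(1):
        yield n
        yield -n

print(list(islice(evens(), 6)))     # prints [2, 4, 6, 8, 10, 12]
print(list(islice(integers(), 6)))  # prints [0, 1, -1, 2, -2, 3]
```

Because each generator hands out exactly one value per step, the step number itself is the one-to-one match with $ 1, 2, 3, … $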

$$ \begin{array}{cccccccc} 1 \hspace{2pt}&\hspace{2pt} 2 \hspace{2pt}&\hspace{2pt} 3 \hspace{2pt}&\hspace{2pt} 4 \hspace{2pt}&\hspace{2pt} 5 \hspace{2pt}&\hspace{2pt} 6 \hspace{2pt}&\hspace{2pt} … \\[6pt] \frac{1}{2} \hspace{2pt}&\hspace{2pt} \class{grey}{\frac{2}{2}} \hspace{2pt}&\hspace{2pt} \frac{3}{2} \hspace{2pt}&\hspace{2pt} \class{grey}{\frac{4}{2}} \hspace{2pt}&\hspace{2pt} \frac{5}{2} \hspace{2pt}&\hspace{2pt} \class{grey}{\frac{6}{2}} \hspace{2pt}&\hspace{2pt} \\[3pt] \frac{1}{3} \hspace{2pt}&\hspace{2pt} \frac{2}{3} \hspace{2pt}&\hspace{2pt} \class{grey}{\frac{3}{3}} \hspace{2pt}&\hspace{2pt} \frac{4}{3} \hspace{2pt}&\hspace{2pt} \frac{5}{3} \hspace{2pt}&\hspace{2pt} \class{grey}{\frac{6}{3}} \hspace{2pt}&\hspace{2pt} \cdots \\[3pt] \frac{1}{4} \hspace{2pt}&\hspace{2pt} \class{grey}{\frac{2}{4}} \hspace{2pt}&\hspace{2pt} \frac{3}{4} \hspace{2pt}&\hspace{2pt} \class{grey}{\frac{4}{4}} \hspace{2pt}&\hspace{2pt} \frac{5}{4} \hspace{2pt}&\hspace{2pt} \class{grey}{\frac{6}{4}} \hspace{2pt}&\hspace{2pt} \\[3pt] \frac{1}{5} \hspace{2pt}&\hspace{2pt} \frac{2}{5} \hspace{2pt}&\hspace{2pt} \frac{3}{5} \hspace{2pt}&\hspace{2pt} \frac{4}{5} \hspace{2pt}&\hspace{2pt} \class{grey}{\frac{5}{5}} \hspace{2pt}&\hspace{2pt} \frac{6}{5} \hspace{2pt}&\hspace{2pt} \\[3pt] \frac{1}{6} \hspace{2pt}&\hspace{2pt} \class{grey}{\frac{2}{6}} \hspace{2pt}&\hspace{2pt} \class{grey}{\frac{3}{6}} \hspace{2pt}&\hspace{2pt} \class{grey}{\frac{4}{6}} \hspace{2pt}&\hspace{2pt} \frac{5}{6} \hspace{2pt}&\hspace{2pt} \class{grey}{\frac{6}{6}} \hspace{2pt}&\hspace{2pt} \\[3pt] \hspace{2pt}&\hspace{2pt} \vdots \hspace{2pt}&\hspace{2pt} \hspace{2pt}&\hspace{2pt} \vdots \hspace{2pt}&\hspace{2pt} \hspace{2pt}&\hspace{2pt} \hspace{2pt}&\hspace{2pt} \hspace{2pt}&\hspace{2pt} \class{white}{\ddots} \end{array} $$

But we can take it one step further: we can find such a sequence for the rational numbers too, by laying out all the fractions on a grid. We can follow diagonals up and down and pass through every single one. If we eliminate duplicates like $ 1 = 2/2 = 3/3 $ and alternate positives and negatives, we can 'count them all'. So there are as many fractions as there are natural numbers. "Deal with it", says Infinity, donning its sunglasses.
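Here is one way to sketch that count in Python. Note the simplification: it walks every diagonal in the same direction instead of zigzagging up and down, which visits exactly the same fractions in a slightly different order. The generator names are hypothetical.

```python
from fractions import Fraction
from itertools import count, islice
from math import gcd

def positive_rationals():
    """Walk the grid of fractions p/q diagonal by diagonal, skipping duplicates."""
    for total in count(2):          # p + q is constant along each diagonal
        for p in range(1, total):
            q = total - p
            if gcd(p, q) == 1:      # 2/2, 2/4, ... repeat earlier entries
                yield Fraction(p, q)

def all_rationals():
    """Start at 0, then alternate positives and negatives."""
    yield Fraction(0)
    for r in positive_rationals():
        yield r
        yield -r

first = list(islice(all_rationals(), 9))
print(", ".join(str(r) for r in first))
# 0, 1, -1, 1/2, -1/2, 2, -2, 1/3, -1/3
```

Every fraction sits on some diagonal, and every diagonal is finite, so each one gets its turn after finitely many steps.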

$$ \begin{array}{c} 0.\hspace{1pt}\class{green}{1}\hspace{1pt}0\hspace{1pt}0\hspace{1pt}1\hspace{1pt}1\hspace{1pt}1\hspace{1pt}0\hspace{1pt}…_2 \\ 0.\hspace{1pt}1\hspace{1pt}\class{blue}{0}\hspace{1pt}0\hspace{1pt}1\hspace{1pt}0\hspace{1pt}0\hspace{1pt}1\hspace{1pt}…_2 \\ 0.\hspace{1pt}1\hspace{1pt}0\hspace{1pt}\class{green}{1}\hspace{1pt}0\hspace{1pt}0\hspace{1pt}1\hspace{1pt}0\hspace{1pt}…_2 \\ 0.\hspace{1pt}0\hspace{1pt}1\hspace{1pt}1\hspace{1pt}\class{green}{1}\hspace{1pt}0\hspace{1pt}1\hspace{1pt}1\hspace{1pt}…_2 \\ 0.\hspace{1pt}1\hspace{1pt}0\hspace{1pt}1\hspace{1pt}1\hspace{1pt}\class{blue}{0}\hspace{1pt}0\hspace{1pt}1\hspace{1pt}…_2 \\ 0.\hspace{1pt}0\hspace{1pt}1\hspace{1pt}0\hspace{1pt}1\hspace{1pt}0\hspace{1pt}\class{blue}{0}\hspace{1pt}0\hspace{1pt}…_2 \\ 0.\hspace{1pt}0\hspace{1pt}1\hspace{1pt}1\hspace{1pt}1\hspace{1pt}1\hspace{1pt}0\hspace{1pt}\class{green}{1}\hspace{1pt}…_2 \\ … \\ \\ 0.\hspace{1pt}\class{blue}{0}\hspace{1pt}\class{green}{1}\hspace{1pt}\class{blue}{0\hspace{1pt}0}\hspace{1pt}\class{green}{1\hspace{1pt}1}\hspace{1pt}\class{blue}{0}\hspace{1pt}…_2 \end{array} $$

The real numbers on the other hand are uncountably infinite: no process can list them all in the limit. The basic proof is short: suppose we did have a sequence of all the real numbers between $ 0 $ and $ 1 $ in some order. We could then build a new number by taking all the bits on the diagonal, and flipping zeroes and ones.
That means this number is different from every listed number in at least one digit, so it's not on the list. But it's also between $ 0 $ and $ 1 $, so it should be on the list. Therefore, the list can't exist.
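We can replay the trick on the seven rows shown above. This is a finite sketch, of course: the real argument runs down an infinite list, and the names `rows` and `flipped_diagonal` are only there for illustration.

```python
# The first seven bits of the seven listed numbers, top to bottom.
rows = [
    "1001110",
    "1001001",
    "1010010",
    "0111011",
    "1011001",
    "0101000",
    "0111101",
]

def flipped_diagonal(rows):
    """Take bit i of row i and flip it."""
    return "".join("1" if row[i] == "0" else "0" for i, row in enumerate(rows))

new_number = flipped_diagonal(rows)
print("0." + new_number)  # 0.0100110

# The new number disagrees with row i at position i, so it matches no row.
print(all(new_number[i] != row[i] for i, row in enumerate(rows)))  # True
```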

This even matches our intuitive explanation from earlier. There are so many real numbers, that we had to invent a bit at infinity to try and count them, and find something that would tick at least once for every real number. Even then we couldn't say whether it was $ 0 $ or $ 1 $ anywhere in particular, because it literally depends on how you approach it.

What we just did was a careful exercise in hiding the obvious, namely the digit-based number systems we are all familiar with. By viewing them not as digits, but as paths on a directed graph, we get a new perspective on just what it means to use them. We've also seen how this means we can construct the rationals and reals using the fewest possible ingredients: division by two, and limits.

Drowning By Numbers

In school, we generally work with the decimal representation of numbers. As a result, the popular image of mathematics is that it's the science of digits, not the underlying structures they represent. This permanently skews our perception of what numbers really are, and is easy to demonstrate. You can google countless arguments about why $ 0.999… $ is or isn't equal to $ 1 $. Yet nobody's wondering why $ 0.000… = 0 $, though it's practically the same problem: $ 0.1, 0.01, 0.001, 0.0001, … $

Furthermore, in decimal notation, rational numbers and real numbers look incredibly alike: $ 3.3333… $ vs $ 3.1415…\, $ The question of what it actually means to have infinitely many non-repeating digits, and why this results in continuous numbers, is hidden away in those 3 dots at the end. By imagining $ π $ as $ 3.1415…0000… $ or $ 3.1415…1111… $ we can intuitively bridge the gap to the infinitely small. We see how the distance between two neighbouring real numbers must be so small, that it really is equivalent to $ 0 $.

That's not as crazy as it sounds. In the field of hyperreal numbers, every number actually has additional digits 'past infinity': that's its infinitesimal part. You can imagine this to be a multiple of $ \frac{1}{\infty} $, an infinitely small unit greater than $ 0 $, which I'll call $ ε $. You can add $ ε $ to a real number to take an infinitely small step. It represents a difference that can only be revealed with an infinitely strong microscope. Equality is replaced with adequality: being equal aside from an infinitely small difference.

You can explore this hyperreal number line below.

As $ ε $ is a fully functioning hyperreal number, $ ε^2 $ is also infinitesimal. In fact, it's even infinitely smaller than $ ε $, and we can keep doing this for $ ε^3, ε^4, …\,$ To make matters worse, if $ ε $ is infinitesimal, then $ \frac{1}{ε} $ must be infinitely big, and $ \frac{1}{ε^2} $ infinitely bigger than that. So hyperreal numbers don't just have inwardly nested infinitesimal levels, but outward levels of increasing infinity too. They have infinitely many dimensions of infinity both ways.

So it's perfectly possible to say that $ 0.999… $ does not equal $ 1 $, if you mean they differ by an infinitely small amount. The only problem is that in doing so, you get much, much more than you bargained for.

A Tug of War Between the Gods

That means we can finally answer the question we started out with: why did our continuous atoms seemingly all have $ 0 $ mass, when the total mass was not $ 0 $? The answer is that the mass per atom was infinitesimal. So was each atom's volume. The density, mass per volume, was the result of dividing one infinitesimal amount by another, to get a normal sized number again. To create a finite mass in a finite volume, we have to add up infinitely many of these atoms.

These are the underlying principles of calculus, and the final puzzle piece to cover. The funny thing about calculus is, it's conceptually easy, especially if you start with a good example. What is hard is actually working with the formulas, because they can get hairy very quickly. Luckily, your computer will do them for you:

We're going to go for a drive.

We'll graph speed versus time. We have kilometers per hour vertically, and hours horizontally. We've also got a speedometer—how fast—and an odometer—how far.

Suppose we drive for half an hour at 50 km/h.

$ \class{orangered}{25} $

We end up driving 25 km. This is the area spanned by the two lengths: $ 50 \cdot \frac{1}{2} $, a rectangle.

$ \class{orangered}{60} $

Now we hit the highway and maintain 120 km/h for the rest of the hour. We go an additional 60 km, the area of the second rectangle, $ 120 \cdot \frac{1}{2} $.
Whenever we multiply two units like speed and time, we can always visualize the result as an area.

$ \class{slate}{85} $

Because we crossed 85 km in one hour, this is equivalent to driving at a constant speed of 85 km/h for the duration. The total area is the same.
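Written out, the two rectangles and the single wide one cover the same area:

$$ 50 \cdot \frac{1}{2} + 120 \cdot \frac{1}{2} = 25 + 60 = 85 = 85 \cdot 1 $$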

If this were a race between two different cars, we would see a photo finish. The distance travelled in kilometers is identical at the 1 hour mark. Where they differ is in their speed along the way, with the red car falling behind and then catching up.

The difference is visible in the slope of both paths. The faster the car, the more quickly it accumulates kilometers. If it drove 25 km in half an hour, then its speed was 50 km/h, $ \frac{25}{0.5} $. This is the distance travelled divided by the time it took, vertical divided by horizontal.

Slope is a relative thing. If we shrink the considered time, the distance shrinks along with it, and the resulting speed is the same. What we're really doing is formalizing the concept of a rate of change, of distance over time.

Constant speed means a constant increase in distance. We can directly relate the area being swept out left to right with the accumulated distance by each car. This is clue number 1.

Now suppose the red car starts ahead by 10 km and drives the same speeds.
It will also end up 10 km ahead after 1 hour: its path has simply been shifted by 10 units. The slope is unchanged: it doesn't matter where you are and where you've been, only how fast you're going right now. It's what's called an instantaneous quantity, one that describes a situation only in the moment. This is clue number 2.

In order to get ahead, the red car had to drive there. So we can imagine it started earlier, $ \frac{1}{5} $ of an hour, driving for 10 km at the same speed. Again, the equality holds: area swept out equals accumulated distance, we add another $ 50 \cdot \frac{1}{5} $. Constant slope still equals constant speed.

One curve describes how the other changes in the moment, therefore the two quantities are linked somehow. We add up area to go from speed to distance; we find slope to go from distance to speed. We're going to examine this two-way relationship more.

Real cars don't start or stop on a dime, they accelerate and decelerate. So we're going to try more realistic behavior.

Suppose the speed follows a curve. In one hour, the car starts from 0 km/h, accelerates to over 100 km/h and then smoothly decelerates back to standstill. The distance travelled also curves smoothly, from 0 to 60 km, so we've driven 60 km in total.

We can immediately see that at the point where the car was going fastest, the distance was increasing the most. Its slope is steepest at that point. The relationship between the two curves holds.

But actually measuring it is a problem. First, there are no more straight sections to measure the slope on. If we take two points on a curve, the line that connects them doesn't touch the curve, it crosses it at an angle.

Second, we can no longer measure the area by dividing it into rectangles, or any other simple geometric shape. There will always be gaps. We can solve both of these problems with a dash of infinity.

We'll start with area. We have to find an upper and a lower bound again.
We're going to divide the curve into 4 sections.

First, the upper bound. We find the highest value in each section and make a rectangle of that height. This approach is too greedy and overestimates.

The lower bound is similar. We find the smallest value in each interval and make rectangles of that height.
This underestimates and leaves areas uncovered.

If we do 7 divisions instead, we can see that the upper bound has decreased: there is less excess area. The lower bound has increased: the gaps are smaller and more area is covered.

With 10 divisions, it's even better. It seems the upper and lower bounds are approaching each other.

And the same at 13 divisions. If we keep doing this, our slices will get thinner and thinner, and we'll be adding more of them together. If we take a limit, each slice becomes infinitely thin, and there are infinitely many of them. Let's step back and see what that means.

Take for example the sequence of lower bounds.

Because every slice is equally wide, we can glue them together into a single rectangle per step.
Its width $ w $ is the thickness of a single slice, and its height $ h $ is the sum of the heights of the slices.

In the limit, this rectangle becomes both infinitely thin and infinitely tall. This is a tug of war between Zero and Infinity where at first sight, they both seem to win. That's a problem. Luckily, we're not interested in the rectangle itself, but rather its area.

We can change a rectangle's sides without changing its area. We multiply its width by one factor (e.g. $ 2 $), and divide the height by the same amount. The area $ 2w \cdot \frac{h}{2} $ is unchanged. Hence, we can normalize our rectangles to all have the same width, for example $ 1 $.

We can do the same for the upper bounds. We can see that both areas are converging on the same value. This is the true area under the curve, which is neither zero nor infinite. In this tug of war, both parties are equally matched.

Now our sequence looks very different: it's approaching a definite area, sandwiched between red and blue.
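To watch the sandwich close numerically, here is a rough sketch. The speed curve is an assumption: $ 120 \sin^2(πt) $ is not the curve from the diagrams, just a convenient one that starts at rest, peaks above 100 km/h and covers 60 km in the hour.

```python
import math

def speed(t: float) -> float:
    """Hypothetical speed in km/h at time t in hours (not the article's data)."""
    return 120.0 * math.sin(math.pi * t) ** 2

def bounds(n: int) -> tuple[float, float]:
    """Lower and upper rectangle sums over one hour, using n equal slices."""
    width = 1.0 / n
    lower = upper = 0.0
    for i in range(n):
        a, b = i * width, (i + 1) * width
        ends = (speed(a), speed(b))
        # The curve rises until t = 0.5 and falls afterwards, so the lowest
        # value in a slice sits at an end point, and the highest sits at an
        # end point or at the peak if the slice contains it.
        low = min(ends)
        high = speed(0.5) if a <= 0.5 <= b else max(ends)
        lower += low * width
        upper += high * width
    return lower, upper

for n in (4, 7, 10, 13, 100, 10000):
    lo, hi = bounds(n)
    print(f"{n:>6} slices: {lo:8.3f} <= area <= {hi:8.3f}")
```

With 4 slices the bounds are a wide 30 and 90. By 10000 slices they pin the area to within a few hundredths of 60.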

$ \class{slate}{60} $

If we take the limit, we get the area under our curve.

$ \class{orangered}{60} $

This way we can find the area under any smooth curve. This process is called integration. The symbol for integration is $ \int_a^b $ where $ a $ and $ b $ are the start and end points you're integrating between. The S-shape stands for our sum, adding up infinitely many pieces.

$$ \int_0^T \! f(t) \mathrm{d} t $$

We can then integrate one curve to make another, by sweeping out area horizontally from a fixed starting point. We move the end point to a time $ T $ and plot the accumulated value along the way. Using limits, we can do this continuously. This takes us from speed to distance travelled. The quantity $ \,\mathrm{d}t\, $ is the infinitesimal width of each slice, an infinitely small amount of time.
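A crude odometer can be sketched the same way: pick an end time $ T $, chop the interval into thin slices and add them up. The speed curve is again a made-up stand-in, and the slices are sampled at their midpoints, which is one choice among several.

```python
import math

def speed(t: float) -> float:
    """Hypothetical speed in km/h at time t in hours (an assumption)."""
    return 120.0 * math.sin(math.pi * t) ** 2

def odometer(T: float, slices: int = 10_000) -> float:
    """Approximate the integral of speed from 0 to T with thin slices."""
    dt = T / slices
    # Height at the middle of each slice, times the slice's width.
    return sum(speed((i + 0.5) * dt) * dt for i in range(slices))

for T in (0.25, 0.5, 0.75, 1.0):
    print(f"after {T:.2f} h: {odometer(T):6.2f} km")
```

Moving $ T $ along and plotting the result is what turns the speed curve into the distance curve.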

Now we just need to figure out the reverse and find slopes. We'll go back to our failed attempt from earlier.

If we shrink the distance we're considering, our slope estimate gets closer to the true value. But if we try to take a limit, we end up dividing $ 0 $ by $ 0 $.

Instead we need to normalize our sequence again so it doesn't vanish.

We only care about slope: the ratio of the two right sides. Which means, if we scale up each triangle, the ratio is unchanged. That just comes down to multiplying both sides by the same number. Again we can scale them all to the exact same width.

Now we've created a limit that does converge to something rather than nothing.

This finite value is the slope at the point we were homing in on. Because we can apply this process at any point on the curve, we can find the exact slope anywhere. This is called finding the derivative or differentiation.

$$ \frac{ \mathrm{d} f(t) }{\mathrm{d} t} $$

We can also apply this process over an entire curve to generate a new one. So now we know how to go the other way: distance to speed. Mathematically, we are dividing an infinitesimal piece of the distance, $ \,\mathrm{d} \class{slate}{f(t)}\, $, by an infinitesimal slice of time $ \,\mathrm{d} t\, $. Working with infinitesimal formulas is tricky however. There's always an implied limit being taken in order to reach them in the first place. Indeed, it took centuries to formalize this fuzzy explanation into what we call differential forms today.
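The shrinking triangles are easy to imitate numerically. In this sketch the distance curve is hypothetical (it is the one whose exact speed is $ 120 \sin^2(πt) $), and we home in on $ t = 0.25 $, where the true speed is 60 km/h.

```python
import math

def distance(t: float) -> float:
    """Hypothetical odometer reading in km at time t in hours (an assumption)."""
    return 60.0 * t - (30.0 / math.pi) * math.sin(2.0 * math.pi * t)

def slope_estimate(t: float, dt: float) -> float:
    """Rise over run between t and t + dt."""
    return (distance(t + dt) - distance(t)) / dt

t = 0.25
for dt in (0.1, 0.01, 0.001, 0.0001):
    rise = distance(t + dt) - distance(t)
    print(f"dt = {dt:<7} rise = {rise:.6f} km   rise/dt = {slope_estimate(t, dt):.3f} km/h")
```

Both the rise and the run head for $ 0 $, but their ratio settles down near 60.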

We can note that if we shift the distance curve up or down, the speed is unchanged. When you take a derivative, any constant value you've added to your function simply disappears. This shows again that speed is always in the moment, it only describes what's going on in an infinitely short piece of curve.

Differentiation is then like x-ray specs for curves and quantities, and it's turtles all the way down. For example, if we differentiate speed, we get acceleration. This is another rate of change, of speed over time. We see the car's acceleration is initially positive, speeding up, and then goes negative, to slow down, i.e. accelerate in the opposite direction.
Note: The acceleration has been divided by 4 to fit.

If we integrate acceleration to get speed, we have to count the second part as negative area: it is causing the speed to decrease.

We can see that the point of maximum speed is the point where the acceleration passes through $ 0 $. One of the most useful applications of derivatives is indeed to find a maximum or minimum of a curve more easily. No matter where it is, the slope at such a point must always be horizontal—provided the curve is smooth.

Let's end this with a more exciting example. What's tall, fast and makes kids scream?

A roller coaster! We'll construct a little track by welding together pieces of circles and lines.

Alas, we shouldn't be too proud of our creation. Even though it looks smooth, there's something very wrong. This is how you build roller coasters when you don't want people to have fun. To see the problem, we need to use our x-ray specs.

$$ \class{orangered}{f^{\prime}(x)} = \frac{\mathrm{d}\class{slate}{f(x)}}{\mathrm{d}x} $$

We differentiate the height into its slope. It has sharp corners all over the place. Even though the track itself looks smooth, it doesn't change smoothly. The slope is constant in the straight sections and changes rapidly in the curved sections.

$$ \class{green}{f^{\prime\prime}(x)} = \frac{\mathrm{d^2}\class{slate}{f(x)}}{\mathrm{d}x^2} $$

If we take the derivative of the slope, i.e. find the slope's slope, we get a measure of curvature. It's positive inside valleys, negative on top of crests. This graph is even worse: there are sharp peaks and cliffs. Note that in the formula, we are now dividing by the square of the infinitesimal distance $ \mathrm{d}x $. This is like going two levels down on the hyperreal number line and back up again.

$$ \class{teal}{κ(x)} = \frac{1}{ρ} = \frac{ \class{green}{f^{\prime\prime}(x)} } { (1 + \class{orangered}{f^{\prime}(x)}^2)^{3/2} } $$

We can see better if we replace the second derivative with the 2D curvature.
Here $ ρ $ is the radius of the circle that touches the curve at a given point. As this radius gets infinitely big on straight sections, we use its inverse, $ \class{teal}{κ} $. Because of how we built the track, $ κ $ switches between $ 0 $ and a constant positive or negative value.
At every switch, there will be a corresponding change in force, a jerk.
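As a sanity check on the formula, here is a small sketch that estimates $ κ $ with finite differences. The radius and the two test shapes are made up: a valley cut from a circle should give $ 1/ρ $, and a straight ramp should give $ 0 $.

```python
import math

def curvature(f, x: float, h: float = 1e-3) -> float:
    """Signed curvature of the graph of f at x, using central differences."""
    slope = (f(x + h) - f(x - h)) / (2 * h)            # approximates f'(x)
    bend = (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)  # approximates f''(x)
    return bend / (1 + slope ** 2) ** 1.5

R = 20.0  # hypothetical radius of the circular pieces

def valley(x: float) -> float:
    """Bottom of a circle of radius R that touches the ground at x = 0."""
    return R - math.sqrt(R * R - x * x)

def straight(x: float) -> float:
    """A straight ramp."""
    return 0.5 * x

print(curvature(valley, 3.0))    # close to 1 / R = 0.05
print(curvature(straight, 3.0))  # close to 0
```

Driving from the ramp into the valley, $ κ $ jumps from $ 0 $ to $ 0.05 $ with nothing in between.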

Let's simulate a ride. As riders go through our curved sections, their inertia will push them to the outside of the curve. From their point of view, this is a centrifugal force up or down. We'll plot the (subjective) vertical G force including gravity. It starts at a comfy 1 G, but then swings wildly between 0.5 G and 1.25 G.

Even though the track seems smooth, we can see that the vertical G's are not. Every time we enter a curve, we experience a sudden jerk up or down. This is due to the jumps in the curvature. The G's are themselves curved, because the rider's sense of gravity decreases as the cart goes vertical. The sharp dips below 0.5 G are not simulation errors: this is actually what it would feel like.

To really highlight the problem, we need to x-ray the G's and derive again. G forces are a form of acceleration. The derivative of acceleration is a change in force, called jerk. Whenever it's non-zero, you feel jerked in a particular direction.

To fix this, we need to alter the curve of the track and smooth it out at all the different levels of differentiation. Here I've applied a relaxation procedure. It's like a blur filter in Photoshop: we replace every point on the track with the average of its neighbours. We get a subtly different curve. Its height hasn't changed much at all, it's just a little bit less tense.
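A bare-bones version of such a relaxation might look like this. It is a sketch under assumptions: the track is a made-up list of 30 height samples (flat, then a constant bend, then a straight ramp), the end points are pinned, and the procedure behind the diagrams may well differ in its details.

```python
def relax(heights, passes=1):
    """Replace every interior point with the average of its two neighbours."""
    track = list(heights)
    for _ in range(passes):
        smoothed = track[:]
        for i in range(1, len(track) - 1):
            smoothed[i] = 0.5 * (track[i - 1] + track[i + 1])
        track = smoothed
    return track

def bend(track):
    """Second differences of the heights, a discrete stand-in for f''."""
    return [track[i - 1] - 2 * track[i] + track[i + 1]
            for i in range(1, len(track) - 1)]

def biggest_jump(values):
    """Largest change between two neighbouring values."""
    return max(abs(b - a) for a, b in zip(values, values[1:]))

def build_track():
    """Hypothetical track: flat, then a constant bend, then a straight ramp."""
    bends = [0.0] * 10 + [0.02] * 10 + [0.0] * 10
    height, slope, track = 0.0, 0.0, []
    for b in bends:
        track.append(height)
        slope += b
        height += slope
    return track

track = build_track()
for passes in (0, 1, 5, 20):
    smooth = relax(track, passes)
    moved = max(abs(a - b) for a, b in zip(track, smooth))
    print(f"{passes:>2} passes: biggest jump in bend = {biggest_jump(bend(smooth)):.4f}, "
          f"heights moved by at most {moved:.4f}")
```

A single pass already halves the jump in the second differences, while no height moves by more than a hundredth of a unit.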

But this minor change has a huge effect on both slope and radius of curvature. They are completely smoothed out, with all corners and jumps removed.

If we do another simulation, the G force graph looks completely different. There are no more jumps.

But the real difference is in jerk. There are no more actual jerks, only smooth oscillations. Instead of bruises, riders will get butterflies. Thanks to calculus, we avoided that painful lesson without ever having to ride it ourselves.

Please check your pockets for loose items. Lost property will not be returned.

Let's start with the original, unrelaxed track. Thanks to calculus, we can simulate head-bobbing so you can get a feel for how jerky this is. Even virtually, this isn't very pleasant.

This is the improved track. Notice the smooth transitions in and out of curves.

And that's how you make sweet roller coasters: by building them out of infinitely small, smooth pieces, so you don't get jerked around too much.

That was differential and integral calculus in a nutshell. We saw how many people actually spend hours every day sitting in front of an integrator: the odometers in their cars, which integrate speed into distance. And the derivative of speed is acceleration—i.e. how hard you're pushing on the gas pedal or brake, combined with forces like drag and friction.

By using these tools in equations, we can describe laws that relate quantities to their rates of change. Drag, also known as air resistance, is a force which gets stronger the faster you go. This is a relationship between the first and second derivatives of position.

In fact, the relaxation procedure we applied to our track is equivalent to another physical phenomenon. If the curve of the coaster represented the temperature along a thin metal rod, then the heat would start to equalize itself in exactly that fashion. Temperature wants to be smooth, eventually averaging out completely into a flat curve.
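In symbols, and in notation that is mine rather than the diagrams': if $ u(x, t) $ is the temperature at position $ x $ and time $ t $, the standard one-dimensional heat equation says that temperature changes at a rate proportional to its own second derivative in space. The constant $ α $ is the material's thermal diffusivity.

$$ \frac{\partial u}{\partial t} = α \, \frac{\partial^2 u}{\partial x^2} $$

Assuming evenly spaced samples $ u_i $, replacing a sample with the average of its neighbours nudges it by half of its discrete second difference, which amounts to one small time step of that equation:

$$ \frac{u_{i-1} + u_{i+1}}{2} = u_i + \frac{1}{2} \left( u_{i-1} - 2 u_i + u_{i+1} \right) $$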

Whether it's heat distribution, fluid dynamics, wave propagation or a head bobbing in a roller coaster, all of these problems can be naturally expressed as so-called differential equations. Solving them is a skill learned over many years, and some solutions come in the form of infinite series. Again, infinity shows up, ever the uninvited guest at the dinner table.

Closing Thoughts

Infinity is a many splendored thing but it does not lift us up where we belong. It boggles our mind with its implications, yet is absolutely essential in math, engineering and science. It grants us the ability to see the impossible and build new ideas within it. That way, we can solve intractable problems and understand the world better.

What a shame then that in pop culture, it only lives as a caricature. Conversations about infinity occupy a certain sphere of it—Pink Floyd has been playing on repeat, and there's usually someone peddling crystals and incense nearby.
"Man, have you ever, like, tried to imagine infinity…?" they mumble, staring off into the distance.

"Funny story, actually. We just came from there…"

Comments, feedback and corrections are welcome on Google Plus. Diagrams powered by MathBox.

More like this: How to Fold a Julia Fractal.

Hackery, Math & Design

Steven Wittens
