Tools for Thought

Graphical Algebra and Fourier Analysis

"**I think a lot of mathematics is really about how you understand things in your head.** […]

It may seem unimportant, but when I started out […] People drew pictures of surfaces one way and I started drawing them a different way. **There's something significant about how the representation in your head profoundly changes how you think.**

It's very hard to do a brain dump. … Words are one thing, […] There are many precise mathematical words that could be used, but they don't automatically convey a feeling for it. **I probably can't convey a feeling for it either, but I want to try.**"

- William Thurston, The Mystery of 3-Manifolds

$$ f(x) = 2^x $$

$$ 2 \cdot 2^x = 2^{1 + x} $$

$$ 2^a \cdot 2^b = 2^{a + b} $$

$$ (2^x)^2 = 2^{2x} $$

$$ (2^a)^b = 2^{a \cdot b} $$

$$ y = 2^x \Leftrightarrow x = \log_2 y $$

$$ f(x) = x^2 $$

$$ x^2 + 2x = (x + 1)^2 - 1 $$

$$ x^2 + \frac{b}{a}x + \frac{c}{a} = (x + \frac{b}{2a})^2 + C $$

$$ f(x) = \cos x $$

$$ \frac{1 + \cos 2x}{2} = \cos^2 x $$

71,592 triangles — 214,776 vertices

833,472 triangles — 2,500,416 vertices

$$ \color{#409020}{X_k}\ = \sum_{n=0}^{N-1} \color{#3090FF}{x_n} \cdot e^{-2 \pi i k n / N} $$

Presented at the *Tools for Thought* workshop, Recurse Center, NYC 2016

Hi, I'm Steven. I usually start with my website, aka *that site with that header*, as my de facto calling card. This effect is powered by WebGL, and consists of live geometry generated in JavaScript and streamed to the GPU. This way I can feed in large amounts of data very efficiently, in this case about 45,000 triangles.

The site's header is just a fun piece of live art, but really, I'm more interested in using this power for communication purposes. In particular, for a few years now, I've worked on visualizing mathematical concepts directly, with minimal fuss or fluff. William Thurston was a mathematician who did similar things, trying to illuminate difficult concepts.

In his case, it was 3-manifolds, aka *curved 3-dimensional spaces*, the higher-dimensional cousins of surfaces. I found this quote in one of his lectures, and it really resonated with me: if words are poor tools for doing a brain dump, maybe we should try to explain things without words altogether.

I've been coding ever since I could reach a keyboard. By playing around with computer graphics, I realized I could reason very intuitively about mathematics, by translating formulas to pictures. I built up a visual understanding of common topics, and found it much easier to remember and think about than formulas. For lack of a better term, let's call that *Graphical Algebra*.

(let's compile some WebGL shaders)

Take for example an exponential curve.

It has some really surprising properties.

On the left, we stretch the curve vertically by a factor of 2.

On the right, I simply shift it left by one unit.

Both curves are the same. We can express this symbolically, but this representation is not as satisfying to me.

It's a specific case of the formula for *adding up* exponents. We can memorize the formula, but I find it easier to remember it like this: *vertical scaling is equivalent to horizontal translation*. It's an inherent *freaky property* of the exponential curve.
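If you'd rather check this numerically than take the picture's word for it, here's a minimal sketch. The function names `stretched` and `shifted` are made up for illustration and are not from the talk's source code. It also tries factors other than 2, assuming the general rule that scaling by c matches a shift of log2(c).

```python
import math

def f(x):
    """The exponential curve f(x) = 2^x."""
    return 2.0 ** x

def stretched(x, c):
    """Scale the curve vertically by a factor c."""
    return c * f(x)

def shifted(x, c):
    """Shift the curve left by log2(c) units."""
    return f(x + math.log2(c))

# Compare the two transformed curves at a handful of sample points.
samples = [-2.0, -0.5, 0.0, 1.0, 3.25]
for c in (2.0, 3.0, 0.5):
    for x in samples:
        assert math.isclose(stretched(x, c), shifted(x, c), rel_tol=1e-12)
```

For c = 2 the shift is exactly one unit, which is the case drawn above. A factor below 1, like 0.5, gives a negative shift, i.e. the curve moves right instead.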

It's not a particularly unique example either.

For example, we can *square* the values on the left vertically. Unlike before, this is a highly *non-linear* transformation. It seems like that would be much harder to map cleanly onto something else.

But no, all we do is scale the graph on the right *horizontally* by a factor of 2.

Once again, both curves are actually the same, and we can express this symbolically.

The general case is the formula for *multiplying* exponents: *vertical powers are equivalent to horizontal scaling*. Once again, it's an inherent property of the exponential curve. What's more, because (2^{2})^{x} = 4^{x}, it implies that really, there is only *one exponential curve to rule them all*. Every exponential curve of any base or offset can be drawn by scaling/translating any other exponential curve, vertically/horizontally. I don't know about you, but this was never explained to me in school. I find this a very neat way to understand exactly what makes exponentials so special.
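Spelled out symbolically, as a short sketch: here b is any positive base other than 1 and A is any positive vertical scale factor. Both symbols are introduced just for this illustration.

$$ b^x = (2^{\log_2 b})^x = 2^{x \cdot \log_2 b} $$

$$ A \cdot b^x = 2^{\log_2 A} \cdot 2^{x \cdot \log_2 b} = 2^{x \cdot \log_2 b + \log_2 A} $$

So changing the base is a horizontal scaling by a factor of log₂ b, and the vertical scale A becomes a horizontal shift, using only the two exponent rules from before.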

In fact, because the logarithmic function is the inverse of the exponential, all the same things apply there, and the *one true logarithmic curve* is the exact same one too. This is why powers of x turn into straight lines on a log-log plot.
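The straight-line claim is easy to test when both axes are logarithmic: taking the log of y = x^p gives log y = p · log x, a line of slope p. Here's a minimal sketch, with `loglog_slope` as a hypothetical helper name.

```python
import math

def loglog_slope(p, x0, x1):
    """Slope of y = x^p between x0 and x1, measured on log-log axes."""
    y0, y1 = x0 ** p, x1 ** p
    return (math.log(y1) - math.log(y0)) / (math.log(x1) - math.log(x0))

# The slope equals the exponent p, no matter where along the curve we measure.
for p in (0.5, 2.0, 3.0):
    slopes = [loglog_slope(p, x0, x0 * 10.0) for x0 in (0.1, 1.0, 50.0)]
    assert all(math.isclose(s, p, rel_tol=1e-9) for s in slopes)
```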

But this sort of insight doesn't just apply to exponentials. Take for example a quadratic.

Once again we can play around and discover something neat.

On the left, we skew the parabola, i.e. we apply the *affine* transformation *y ← y + 2x*.

On the right, we move the parabola down and to the left by one unit each.

Once again both curves are the same. When we *skew* a parabola, we're really *rolling* its center to the left or right, with an additional vertical shift applied. There is only *one true parabola*, which we can only translate and scale.

In the general case, this turns into the quadratic formula. Students are told to memorize the formula for the *discriminant*, and are often asked to solve a bunch of context-less quadratics one after the other. I find it much more important to remember that, in the standard form, the parabola's center is always at the point x = -b/(2a). The rest can be derived as necessary, once the quadratic has been reduced to the form (…)^{2} + Constant.
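For completeness, here is that derivation worked out, starting from the completed square shown earlier. Expanding the squared term and matching the constant gives

$$ C = \frac{c}{a} - \frac{b^2}{4a^2} = -\frac{b^2 - 4ac}{4a^2} $$

Setting the whole thing to zero and solving for x:

$$ \left(x + \frac{b}{2a}\right)^2 = \frac{b^2 - 4ac}{4a^2} \Leftrightarrow x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} $$

The discriminant b² - 4ac is just the constant C again, rescaled and with its sign flipped: it tells you how far the shifted parabola sits above or below the x-axis.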

You can apply the same sort of thinking to more complicated situations, not just elementary functions. For example, the cosine.

First, let's take the absolute value of the cosine, making all negative values positive. This effectively folds the bottom half onto the top.

Now, let's square the values again. As the cosine ranges from -1 to 1, the values get compressed down towards zero non-linearly. We end up with a cosine of twice the frequency, scaled and translated vertically.

This is commonly expressed as a trigonometric double-angle formula, and students are sometimes asked to memorize a bunch of them. But I find the graphical representation both more satisfying and more useful. For example, each of the minima used to be a line of absolute slope 1, which was then squared. Also, a cosine has symmetric ups and downs. Together, this tells us that a cosine (or sine) is essentially an approximate sequence of parabolas, alternating upside down. This tells you about the Taylor series expansion, and is not that surprising if you know that sine and cosine are solutions to a second-order differential equation.
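Both observations can be checked numerically. The sketch below compares the squared cosine against the double-frequency cosine, and then measures how closely a cosine peak follows the parabola 1 - x²/2, i.e. the first two terms of its Taylor series. The function names are illustrative only.

```python
import math

def squared_cosine(x):
    """The cosine with its values squared."""
    return math.cos(x) ** 2

def double_frequency(x):
    """Cosine of twice the frequency, halved in amplitude and raised by 1/2."""
    return (1.0 + math.cos(2.0 * x)) / 2.0

def parabola(x):
    """Upside-down parabola matching the cosine's peak at x = 0."""
    return 1.0 - x * x / 2.0

xs = [i * 0.1 for i in range(-40, 41)]

# The two curves coincide everywhere we sample them.
assert all(math.isclose(squared_cosine(x), double_frequency(x), abs_tol=1e-12)
           for x in xs)

# Near the peak, the cosine and the parabola differ by less than 0.003.
max_error = max(abs(math.cos(x) - parabola(x)) for x in xs if abs(x) <= 0.5)
assert max_error < 0.003
```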

So what does it look like to apply this sort of graphical thinking and direct visualization to more complicated math, like Fourier Analysis? This is the art of finding particular *frequencies* in waves. **Note: This part requires Web Audio API and only works in Chrome or Firefox.**

(this one takes a little longer to compile... bear with me, you'll see why... if you're on a laptop, crotch beware, things are about to get toasty)

I'm going to use live audio as the input. This is MP3 internet radio streamed from fm.acko.net. Or as I like to call it, a 24/7 rave party on the internet, complete with trippy visuals.

This is the sound wave visualized, 23 milliseconds at a time, i.e. the sound pressure emitted from the speakers. Click the button in the corner to toggle to microphone input instead.

If we find the average value of a (section of a) wave, we get the "DC" value (direct current). The average level of a sound wave is actually zero, but the deviations here are due to the finite window we're sampling from: any frequency below 43 Hz = 1/0.023 s can't complete a full cycle within the window, so it doesn't average out. If you're using tinny laptop speakers, you can't hear any of those by the way.
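Here's a rough sketch of why the window matters. The sample rate of 44,100 Hz and the window of 1,024 samples are my assumptions for illustration (they happen to give about 23 ms), not values confirmed by the demo's source.

```python
import math

SAMPLE_RATE = 44100  # assumed sample rate, in Hz
WINDOW = 1024        # assumed samples per window, roughly 23 ms

def window_average(freq_hz, phase=0.0):
    """Average of a unit sine wave of the given frequency over one window."""
    total = sum(math.sin(2.0 * math.pi * freq_hz * n / SAMPLE_RATE + phase)
                for n in range(WINDOW))
    return total / WINDOW

# The lowest frequency that completes exactly one cycle per window.
lowest = SAMPLE_RATE / WINDOW  # about 43 Hz

# A wave that fits a whole cycle in the window averages out to zero...
assert abs(window_average(lowest)) < 1e-9

# ...but a slower 20 Hz wave only gets through part of its cycle,
# so its average within the window ends up far from zero.
assert window_average(20.0) > 0.5
```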

In Fourier Analysis, we want to take 2D averages though. This is typically done using complex numbers, which are 2D vectors that like to turn.

With complex numbers, it is trivial to *twist* the graph. In doing so, we tune the average to be biased towards a particular frequency. If the wave contains that frequency, then the average will be biased away from the center.

If we increase the twist, we tune the analyzer to a higher frequency.

The magnitude of the average tells us the *amplitude*, and the angle of the average tells us the *phase*, i.e. the particular shift of the wave over time. That is, really, the *scaling and translation* of the frequency.
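As a minimal sketch of the twisted average, assuming a test wave with a known amplitude and phase: `twisted_average` is a hypothetical name, and the twist uses the same e^{-2πikn/N} factor as the DFT formula, with k counting full turns across the window.

```python
import cmath
import math

def twisted_average(samples, k):
    """Average the samples after twisting the graph k full turns."""
    n_total = len(samples)
    total = sum(x * cmath.exp(-2j * math.pi * k * n / n_total)
                for n, x in enumerate(samples))
    return total / n_total

# Test wave: 5 cycles per window, amplitude 0.8, phase 0.6 radians.
N = 256
amplitude, phase, cycles = 0.8, 0.6, 5
wave = [amplitude * math.cos(2.0 * math.pi * cycles * n / N + phase)
        for n in range(N)]

# Tuned to the right frequency, the average moves away from the center.
tuned = twisted_average(wave, cycles)
# A real cosine splits evenly between +k and -k, hence the factor of 2.
measured_amplitude = 2.0 * abs(tuned)
measured_phase = cmath.phase(tuned)

# Tuned to a frequency the wave doesn't contain, the average stays at zero.
off_tune = twisted_average(wave, 9)
```

Here `measured_amplitude` and `measured_phase` recover the 0.8 and 0.6 that went in, while `off_tune` is zero up to rounding error.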

Let's twist it by a total factor of 16. It's now tuned to frequencies of about 650 Hz, roughly 5x the pitch of a human voice.

The diagram is now extremely twisted. As GPUs like to draw straight lines and triangles, I need to divide each curve very finely in order to get the appearance of smooth curves, for a total of 214k vertices and 71k triangles. That's quite a lot.

In fact, instead of just one frequency at a time, let's visualize the entire frequency axis at the same time, from DC up to here. Complexity only goes up. This is really the lesson I learned from building intricate explanatory visualizations: sometimes, in order to create an accurate and intuitive picture of a concept, you need to add an order or two of magnitude. Consider it a graphical version of the Curse of Dimensionality.

So here it is, a live and (mostly) complete picture of the Discrete Fourier Transform, aka the DFT. In practice, we tend to cheat and use the Fast Fourier Transform (FFT) instead, but conceptually, this is what is happening underneath: we're taking the average of increasingly twisted versions of the same wave, creating a whole new dimension, while collapsing the old one by summation. And if you've really been paying attention, it should be no surprise that there's an *exponential* function in the formula, connecting *powers and twisting* with *scaling and translation*. See if you can identify which frequency bands correspond to which sounds, and enjoy! (Source code)
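If you want to poke at the formula without a GPU, here's a naive sketch of the DFT exactly as written on the slide, one twisted sum per frequency. It's the slow O(N²) version, not the FFT and not the demo's actual code; the two-tone test signal is made up for illustration.

```python
import cmath
import math

def dft(samples):
    """Naive DFT: X_k = sum over n of x_n * exp(-2*pi*i*k*n/N)."""
    n_total = len(samples)
    return [sum(x * cmath.exp(-2j * math.pi * k * n / n_total)
                for n, x in enumerate(samples))
            for k in range(n_total)]

# Test signal: a strong tone at 3 cycles per window, a weaker one at 10.
N = 64
signal = [1.0 * math.sin(2.0 * math.pi * 3 * n / N)
          + 0.5 * math.sin(2.0 * math.pi * 10 * n / N)
          for n in range(N)]

spectrum = dft(signal)

# Divide by N to turn each sum into an average, and keep the lower half
# of the axis (the upper half mirrors it for real-valued input).
magnitudes = [abs(X) / N for X in spectrum[:N // 2]]

# The two strongest bins are the two tones we put in.
strongest = sorted(range(N // 2), key=lambda k: magnitudes[k], reverse=True)[:2]
assert strongest == [3, 10]
```

Each magnitude comes out at half the tone's amplitude (0.5 and 0.25 here), because a real wave splits evenly between a positive and a negative frequency.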