Deep zoom theory and practice

Introduction

The complex beauty of the world's most famous fractal, the Mandelbrot set, emerges from the repeated iteration of a simple formula:

\[z \to z^2 + c\]

Zooming into the intricate boundary of the shape reveals ever more detail, but higher precision numbers and higher iteration counts are needed as you go deeper. The computational cost rises quickly with the classical rendering algorithms, which use high precision numbers for each pixel.

In 2013, K.I. Martin's SuperFractalThing and its accompanying white paper sft_maths.pdf popularized a pair of new acceleration techniques. First, one notes that the formula \(z \to z^2 + c\) is continuous, so nearby points remain nearby under iteration. This means you can iterate one point at high precision (the reference orbit) and compute the differences from the reference orbit for each pixel in low precision (the perturbed orbits). Secondly, iterating the perturbed formula gives a polynomial series in the initial perturbation of \(c\), which depends only on the reference. The degree rises rapidly, but you can truncate it to get an approximation. This means you can compute the series approximation coefficients once, substitute in the perturbed \(c\) values for each pixel, and thereby initialize the perturbed orbits at a later iteration, potentially skipping a lot of per-pixel work.

The perturbation technique has since been extended to the Burning Ship fractal and other "abs variations", and it also works for hybrid fractals combining iterations of several formulas.

Prerequisites for the rest of this article: a familiarity with complex numbers and algebraic manipulation; knowing how to draw the unzoomed Mandelbrot set; understanding the limitations of computer implementation of numbers (see for example Hardware Floating Point Types).

Perturbation

In the remainder of this post, lower case and upper case variables with the same letter mean different things. Upper case means unperturbed or reference, usually high precision or high range. Lower case means a perturbed per-pixel delta, low precision and low range.

In perturbation, one starts with the iteration formula [1]:

\[Z \to Z^2 + C\]

Perturb the variables with unevaluated sums [2]:

\[(Z + z) \to (Z + z)^2 + (C + c)\]

Do symbolic algebra to avoid the catastrophic absorption when adding tiny values \(z\) to large values \(Z\) (e.g. 1 million plus 1 is still 1 million if you only have 3 significant digits to work with) [3]:

\[z \to 2 Z z + z^2 + c\]

\(C, Z\) is the "reference" orbit, computed in high precision using [1] and rounded to machine double precision, which works fine most of the time. \(c, z\) is the "pixel" orbit; many of these can be computed near a single reference (e.g. an entire image).
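
As a concrete sketch, here is what the per-pixel inner loop might look like in C++, assuming the reference orbit has already been computed at high precision and rounded to double (names like Zref and perturbed_iterations are illustrative, not KF's actual code):

#include <complex>
#include <vector>

// Perturbed iteration [3]: z -> 2 Z z + z^2 + c, all in machine doubles.
// Zref holds the reference orbit Z_0, Z_1, ... rounded to double precision;
// c is the pixel's offset from the reference C.
int perturbed_iterations(const std::vector<std::complex<double>> &Zref,
                         std::complex<double> c, int maxiters)
{
    std::complex<double> z = 0;
    for (int n = 0; n < maxiters && n + 1 < (int) Zref.size(); ++n)
    {
        z = 2.0 * Zref[n] * z + z * z + c;          // low precision step
        std::complex<double> Zz = Zref[n + 1] + z;  // full value Z + z
        if (std::norm(Zz) > 4.0)                    // escape test on |Z + z|^2
            return n + 1;
    }
    return -1;  // did not escape
}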

Glitches

There is a problem that can be noticed when you zoom deeper near certain features in the fractal: parts can have a "noisy" appearance, or there may be weird flat blobs that look out of place. These are the infamous perturbation glitches. It was observed that adding references in the glitches and recomputing the pixels could fix them, but there was no reliable way to detect them programmatically until Pauldelbrot discovered/invented a method: Perturbation Theory Glitches Improvement.

The criterion: if [4]:

\[|Z+z| \ll |Z|\]

at any iteration, then glitches can occur. The solution: retry with a new reference, or (for well-behaved formulas like the Mandelbrot set) rebase to a new reference and carry on.

Perturbation assumes exact maths, but some images have glitches when naively using perturbation in low precision. Pauldelbrot found his glitch criterion by perturbing the perturbation iterations: one has the perturbed iteration [3] (recap: \(z \to 2 Z z + z^2 + c\)), which one perturbs again with \(z \to z + e, c \to c + f\) [5]:

\[e \to (2 (Z + z) + e) e + f\]

We are interested in what happens to the ratio \(e/z\) under iteration, so rewrite [3] as [6]:

\[z \to (2 Z + z) z + c\]

Pattern matching, the interesting part of \(e/z\) (assuming \(c\) and \(f\) are small) is \(2(Z + z) / (2 Z)\). When \(e/z\) is small, the nearby pixels "stick together": there is not enough precision in the number type to distinguish them, which makes a glitch. So a glitch can be detected when [7]:

\[|Z + z|^2 < G |Z|^2\]

where \(G\) is a threshold (somewhere between 1e-2 and 1e-8, depending how strict you want to be). This does not add much cost, as \(|Z+z|^2\) already needs to be computed for the escape test, and \(G|Z|^2\) can be computed once for each iteration of the reference orbit and stored.
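
In code, this might look like the following sketch, reusing the Zref array from the earlier fragment (again, names are illustrative):

#include <complex>
#include <vector>

// Pauldelbrot's criterion [7]: a glitch when |Z + z|^2 < G |Z|^2.
// The right hand side is precomputed once per reference iteration.
std::vector<double> glitch_thresholds(const std::vector<std::complex<double>> &Zref,
                                      double G)  // e.g. G = 1e-4
{
    std::vector<double> t(Zref.size());
    for (size_t n = 0; n < Zref.size(); ++n)
        t[n] = G * std::norm(Zref[n]);
    return t;
}

// in the per-pixel loop, after computing Zz = Zref[n + 1] + z:
//     if (std::norm(Zz) < t[n + 1]) { /* flag pixel as glitched */ }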

The problem now is: how to choose \(G\)? Too big and it takes forever, as glitches are detected all over; too small and some glitches can be missed, leading to bad images.

The glitched pixels can be recalculated with a more appropriate reference point: more glitches may result and adding more references may be necessary until the image is finished.

Rescaling

Double precision floating point (with 53 bits of mantissa) is more than enough for computing perturbed orbits: even single precision (with 24 bits) can be used successfully. But when zooming deeper, another problem occurs: double precision has a limited range; once values get smaller than about 1e-308 they underflow to 0. This means perturbation with double precision can only zoom so far, as eventually the perturbed deltas are smaller than can be represented.

An early technique for extending range is to store the mantissa as a double precision value, normalized to be near 1 in magnitude, with a separate integer to store the exponent. This floatexp technique works for arbitrarily deep zooms, but the performance is terrible, because every arithmetic operation needs to be handled in software (instead of being a single CPU instruction).
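
A minimal sketch of the idea in C++ (real implementations store things differently and optimize heavily; this is just to show the principle):

#include <cmath>
#include <utility>

struct floatexp
{
    double m;  // mantissa, kept near 1 in magnitude
    long e;    // exponent: represented value is m * 2^e

    explicit floatexp(double x = 0, long e0 = 0)
    {
        int ee = 0;
        m = std::frexp(x, &ee);  // normalize so 0.5 <= |m| < 1 (or m == 0)
        e = ee + e0;
    }
};

// every operation is done in software, hence the poor performance
inline floatexp mul(floatexp a, floatexp b)
{
    return floatexp(a.m * b.m, a.e + b.e);
}

inline floatexp add(floatexp a, floatexp b)
{
    if (a.e < b.e) std::swap(a, b);
    long d = a.e - b.e;
    double bm = d > 100 ? 0.0 : std::ldexp(b.m, (int) -d);  // align exponents
    return floatexp(a.m + bm, a.e);
}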

The solution for efficient performance turned out to be using an unevaluated product (compare with the unevaluated sum of perturbation) to rescale the double precision iterations to be nearer 1 and avoid underflow: substitute \(z = S w\) and \(c = S d\) to get [8]:

\[S w \to 2 Z S w + S^2 w^2 + S d\]

and now cancel out one scale factor \(S\) throughout [9]:

\[w \to 2 Z w + S w^2 + d\]

Choose \(S\) so that \(|w|\) is around \(1\). When \(|w|\) is at risk of overflow (or underflow) after some iterations, redo the scaling; this is typically a few hundred iterations as \(|Z|\) is bounded by \(2\) except at final escape.

Optimization: if \(S\) underflowed to \(0\) in double precision, you don't need to calculate the \(+ S w^2\) term at all when \(Z\) is not small. Similarly you can skip the \(+ d\) if it underflowed. For higher powers there will be terms involving \(S^2 w^3\) (for example), which might not need to be calculated either due to underflow. Ideally these tests would be performed once at rescaling time, instead of in every inner loop iteration (though they would be highly predictable I suppose).
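
Putting [9] and these underflow skips together, one rescaled step and the occasional rescale might look like this sketch (here \(Z\) is the reference value rounded to double, which is safe since \(|Z| \le 2\); in practice the scale factor is tracked at full range and rounded, as described below; names are illustrative):

#include <complex>

// one step of [9]: w -> 2 Z w + S w^2 + d, entirely in double precision
std::complex<double> rescaled_step(std::complex<double> Z, std::complex<double> w,
                                   double S, std::complex<double> d)
{
    std::complex<double> r = 2.0 * Z * w;
    if (S != 0)        // skip S w^2 if S underflowed to 0 (valid when Z is not small)
        r += S * (w * w);
    r += d;            // could similarly be skipped if d underflowed to 0
    return r;
}

// every few hundred iterations: fold |w| back into the scale factor,
// keeping z = S w and c = S d unchanged
void rescale(std::complex<double> &w, double &S, std::complex<double> &d)
{
    double k = std::abs(w);
    if (k > 0) { w /= k; d /= k; S *= k; }
}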

Full Iterations

There is a problem: if \(|Z|\) is very small, it can underflow to \(0\) in unscaled double in [9]. One needs to store the full range \(Z\) and do a full range (e.g. floatexp) iteration at those points, because \(|w|\) can change dramatically. Rescaling is necessary afterwards. This was described by Pauldelbrot: Rescaled Iterations in Nanoscope.

To do the full iteration, compute \(z = S w\) in floatexp (using a floatexp \(S\) so that there is no underflow), and do the perturbed iteration [3] with all variables in floatexp. To rescale afterwards, compute \(S = |z|\), \(w = z/S\) and \(d = c/S\) (computed in floatexp, with \(w\) and \(d\) rounded to double precision afterwards). Then a double precision \(s\) can be computed for use in [9].
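
Here is a sketch of that hand-over, using a complex variant of the floatexp sketch above (the cfe type and helper names are illustrative, and edge cases like \(z = 0\) are glossed over):

#include <cmath>
#include <complex>
#include <utility>

// full-range complex value = m * 2^e (a complex version of floatexp above)
struct cfe { std::complex<double> m; long e; };

static cfe renorm(cfe a)
{
    int ee = 0;
    std::frexp(std::abs(a.m), &ee);
    return { std::ldexp(1.0, -ee) * a.m, a.e + ee };
}
static cfe mul(cfe a, cfe b) { return renorm({ a.m * b.m, a.e + b.e }); }
static cfe add(cfe a, cfe b)
{
    if (a.e < b.e) std::swap(a, b);
    long sh = a.e - b.e;
    std::complex<double> bm =
        sh > 100 ? std::complex<double>(0) : std::ldexp(1.0, (int) -sh) * b.m;
    return renorm({ a.m + bm, a.e });
}

// one full-range perturbed step [3] where Z is small, then recompute the
// scale: S = |z|, w = z / S, d = c / S, rounded back to double afterwards.
// z_full = S w is assumed already reconstructed at full range by the caller.
void full_iteration_and_rescale(cfe Z_n, cfe c_full, cfe &z_full,
                                std::complex<double> &w,
                                std::complex<double> &d, double &s)
{
    z_full = add(add(mul({ 2.0, 0 }, mul(Z_n, z_full)), mul(z_full, z_full)), c_full);
    double k = std::abs(z_full.m);            // S = k * 2^(z_full.e)
    w = z_full.m / k;                         // exponents cancel: |w| is near 1
    long sh = c_full.e - z_full.e;
    d = sh < -100 ? std::complex<double>(0)   // underflow, handled by the skips in [9]
                  : std::ldexp(1.0, (int) sh) * (c_full.m / k);
    s = std::ldexp(k, (int) z_full.e);        // may underflow to 0 in double
}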

Abs Variations

The Burning Ship fractal modifies the Mandelbrot set formula by taking absolute values of the real and imaginary parts before the complex squaring [10]:

\[X + i Y \to (|X| + i |Y|)^2 + C\]

When perturbing the Burning Ship and other "abs variations", one ends up with things like [11]:

\[|XY + Xy + xY + xy| - |XY|\]

which naively gives \(0\) by catastrophic absorption and cancellation. laser blaster made a case analysis in Perturbation Formula for Burning Ship, which can be written as [12]:

diffabs(c, d) := |c + d| - |c|
              = c >= 0 ? (c + d >= 0 ? d : -(2*c + d))
                       : (c + d > 0 ? 2*c + d : -d)
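
As a C++ function (a direct transcription of [12]):

// |c + d| - |c| computed by case analysis, avoiding catastrophic
// absorption and cancellation when |d| is much smaller than |c|
double diffabs(double c, double d)
{
    if (c >= 0)
        return c + d >= 0 ? d : -(2 * c + d);
    else
        return c + d > 0 ? 2 * c + d : -d;
}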

When \(d\) is small, the \(\pm d\) cases are much more likely. With rescaling in the mix, [11] works out as [13]:

\[\operatorname{diffabs}(XY/s, Xy + xY + sxy)\]

which has the risk of overflow when \(s\) is small, but the signs work out ok even for infinite \(c\), as \(d\) is known to be finite. Moreover, if \(s = 0\) due to underflow, the \(\pm d\) branches will always be taken (except when \(XY\) is small, in which case a full floatexp iteration will be performed instead), and as \(s \ge 0\) by construction, [13] reduces to [14]:

\[\operatorname{sign}(X) * \operatorname{sign}(Y) * (X y + x Y)\]

(Note: this formulation helps avoid underflow in \(\operatorname{sign}(XY)\) when \(X\) and \(Y\) are small.)

Deep Needle

For well-behaved functions like the Mandelbrot set iterations, one needs to do full iterations when \(Z\) gets small. For the Burning Ship and other abs variations, this is not sufficient: problems occur if either \(X\) or \(Y\) is small, not only when both are small at the same time. Full iterations need to be done when either variable is small. This makes rescaled iterations for locations near the needle slower than just doing full floatexp iterations all the time (because of the extra wasted work handling the rescaling): near the needle all the iterations have \(Y\) near \(0\), which means floatexp iterations will be done anyway. Using floatexp from the get-go avoids many branches and rescaling in the inner loop, so it's significantly faster. The problem is worse in single precision, because it has much less range: it underflows below 1e-38 or so, rather than 1e-308 for double precision.

The problem of automatically detecting these "deep needle" locations (which may be in the needles of miniships) and switching implementations to avoid the extra slowdown remains unresolved in KF.

Hybrid Fractals

The Mandelbrot set has lovely logarithmic spirals all over, and the Burning Ship has interesting "rigging" on the miniships on its needle. Hybridization provides a way to get both of these features in a single fractal image. The basic idea is to interleave the iteration formulas, for example alternating between [1] and [10], but more complicated interleavings are possible (e.g. [1][10][1][1] in a loop, etc.).

Hybrid fractals in KF are built from stanzas; each stanza has some lines, each line has two operators, and each operator has controls for absolute x, absolute y, negate x, negate y, an integer power \(p\), and a complex multiplier \(a\). The two operators in a line can be combined by addition, subtraction or multiplication; currently the number of lines in a stanza can be either 1 or 2, and there can be 1, 2, 3 or 4 stanzas. The output of each line is fed into the next, and at the end of each stanza the +c part of the formula happens. There are controls to choose how many times to repeat each stanza, and which stanza to continue from after reaching the end.
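
The structure might be sketched like this in C++ (illustrative names, not KF's actual types):

#include <complex>
#include <vector>

struct Operator
{
    bool abs_x, abs_y;       // absolute x, absolute y
    bool neg_x, neg_y;       // negate x, negate y
    int p;                   // integer power
    std::complex<double> a;  // complex multiplier
};
enum class Combine { Add, Sub, Mul };   // how a line's two operators combine
struct Line { Operator one, two; Combine mode; };
struct Stanza { std::vector<Line> lines; int repeats; };  // +c after each stanza
struct Hybrid { std::vector<Stanza> stanzas; int continue_from; };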

Implementing perturbation for this is quite methodical. Start from an operator, with inputs \(Z\) and \(z\). Set mutable variables:

z := input
W := Z + z
B := Z

If absolute x is enabled in the formula, then update

re(z) := diffabs(re(Z), re(z))
re(W) := abs(re(W))
re(B) := abs(re(B))

Similarly for the imaginary part. If negate x is enabled in the formula, then update

re(z) := -re(z)
re(W) := -re(W)
re(B) := -re(B)

Similarly for the imaginary part. Now compute

\[S = \sum_{i=0}^{p-1} W^i B^{p-1 - i}\]

and return \(a z S\). Combining operators into lines may be done by Perturbation Algebra. Combining lines into stanzas can be done by iterating unperturbed \(Z\) alongside perturbed \(z\); only the \(+C\) needs high precision, and that is not done within a stanza.
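
The sum \(S\) works because \(W^p - B^p = (W - B) S\), and in the unscaled case \(W - B = z\), so \(a z S\) is exactly the perturbation of the operator's \(a Z^p\). A sketch of this step in C++ (illustrative names):

#include <complex>
#include <vector>

// returns a * z * sum_{i=0}^{p-1} W^i B^(p-1-i)  ==  a * (W^p - B^p), as W - B = z
std::complex<double> operator_perturbation(std::complex<double> a,
                                           std::complex<double> z,
                                           std::complex<double> W,
                                           std::complex<double> B, int p)
{
    std::vector<std::complex<double>> Bpow(p);  // Bpow[j] = B^j
    std::complex<double> Bp = 1;
    for (int j = 0; j < p; ++j) { Bpow[j] = Bp; Bp *= B; }
    std::complex<double> S = 0, Wi = 1;         // Wi = W^i
    for (int i = 0; i < p; ++i) { S += Wi * Bpow[p - 1 - i]; Wi *= W; }
    return a * z * S;
}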

Rescaling hybrid iterations seems like a big challenge, but it's not that hard: if either or both of the real and imaginary parts of the reference orbit \(Z\) are small, one needs to do a full range iteration with floatexp and recalculate the scale factor afterwards, as with formulas like the Burning Ship. Otherwise, thread \(s\) through from the top level down to the operators. Initialize with

W := Z + z*s

and modify the absolute cases to divide the reference by \(s\):

re(z) := diffabs(re(Z/s), re(z))

Similarly for imaginary part. When combining operators (this subterm only occurs with multiplication) replace \(f(o_1, Z + z)\) with \(f(o_1, Z + z s)\).

And that's almost all the changes that need to be made!

For distance estimation of hybrid formulas I use dual numbers for automatic differentiation. One small adjustment was needed for it to work with rescaled iterations: instead of initializing the dual parts (before iteration) with 1 and scaling by the pixel spacing at the end for screen-space colouring, initialize the dual parts with the pixel spacing and don't scale at the end. This avoids overflow of the derivative, and the same rescaling factor can be used for regular and dual parts.
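
A sketch of the dual number arithmetic and the adjusted initialization (illustrative, not KF's actual implementation):

#include <complex>

// dual number: x is the value, dx carries the derivative along for the ride
struct Dual { std::complex<double> x, dx; };

Dual add(Dual a, Dual b) { return { a.x + b.x, a.dx + b.dx }; }
Dual mul(Dual a, Dual b) { return { a.x * b.x, a.x * b.dx + a.dx * b.x }; }

// before iteration: Dual c = { pixel_c, pixel_spacing }; Dual z = { 0, 0 };
// (the dual part of c is the pixel spacing instead of 1, so no scaling is
// needed at the end and the derivative is less likely to overflow)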

Code Generation

Naive implementations of parametric hybrids are very slow, due to all the branches in the inner loops (checking whether absolute x is enabled, at every iteration for every pixel, etc.). Using for example OpenCL, these branches can be resolved once when generating source code for a formula, instead of at every iteration for every pixel. This runs much faster, even when compiled to run on the same OpenCL device that would otherwise interpret the parametric code.

Series Approximation

The other technique that K.I. Martin's SuperFractalThing popularized was that iterating [3] gives a polynomial series in \(c\) [15]:

\[z_n = \sum_k A_{n,k} c^k\]

(with 0 constant term). This can be used to "skip" a whole bunch of iterations, assuming that truncating the series and/or low precision doesn't cause too much trouble. Substituting [15] into [3] gives [16]:

\[\sum_k A_{n+1,k} c^k = 2 Z \sum_k A_{n,k} c^k + \left(\sum_k A_{n,k} c^k\right)^2 + c\]

Equating coefficients of \(c^k\) gives recurrence relations for the series coefficients \(A_{n,k}\). See Simpler Series Approximation.
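
For the Mandelbrot set the recurrence is \(A_{n+1,k} = 2 Z_n A_{n,k} + \sum_{j=1}^{k-1} A_{n,j} A_{n,k-j} + [k = 1]\), where \([k = 1]\) is the contribution of the final \(+ c\). A sketch of one step, truncated to a fixed number of terms (in practice the coefficients span a huge range, so a floatexp-like type would be used instead of plain doubles):

#include <complex>
#include <vector>

// advance the truncated series coefficients one iteration: A[k-1] = A_{n,k}
std::vector<std::complex<double>>
series_step(const std::vector<std::complex<double>> &A, std::complex<double> Zn)
{
    int K = (int) A.size();
    std::vector<std::complex<double>> B(K);
    for (int k = 1; k <= K; ++k)
    {
        std::complex<double> s = 2.0 * Zn * A[k - 1];  // from the 2 Z z term
        for (int j = 1; j <= k - 1; ++j)
            s += A[j - 1] * A[k - 1 - j];              // from the z^2 term
        if (k == 1) s += 1.0;                          // from the + c term
        B[k - 1] = s;
    }
    return B;
}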

The traditional way to decide whether it's ok to apply the series approximation at a given iteration is to check that it doesn't deviate too far from regular iterations (or perturbed iterations) at a collection of "probe" points. When it starts to deviate, roll back an iteration and initialize all the image pixels with [15] at that iteration.
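
A sketch of the probe test (Horner evaluation of the truncated [15]; the tolerance and names are illustrative):

#include <complex>
#include <vector>

// evaluate sum_k A[k-1] c^k by Horner's rule (zero constant term)
std::complex<double> evaluate_series(const std::vector<std::complex<double>> &A,
                                     std::complex<double> c)
{
    std::complex<double> s = 0;
    for (int k = (int) A.size(); k >= 1; --k)
        s = s * c + A[k - 1];
    return s * c;
}

// compare against the honestly iterated perturbed z at the same iteration
bool probe_ok(const std::vector<std::complex<double>> &A,
              std::complex<double> c_probe, std::complex<double> z_probe,
              double tol = 1e-6)
{
    return std::abs(evaluate_series(A, c_probe) - z_probe)
        <= tol * std::abs(z_probe);
}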

Later, knighty extended the series approximation to two complex variables. If the reference \(C\) is a periodic point (for example the center of a minibrot), the biseries in \(z, c\) allows skipping a whole period of iterations. Then multiple periods can be skipped by repeating the biseries step. This gives a further big speedup beyond regular series approximation near minibrots. An escape radius is needed for \(z\), based on properties of the reference, so as not to perform too many biseries iterations. After that, regular perturbed iterations are performed until final escape. This is available in KF as NanoMB1.

Current research by knighty and others involves a chain of minibrots at successively deeper zoom levels. One starts with the deepest minibrot, performing biseries iterations until it escapes its \(z\) radius. Then rebase the iterates to the next outer minibrot, and perform biseries iterations with that. Repeat until final escape. This is available in KF as NanoMB2, but it's highly experimental and fails for many locations. Perhaps it needs to be combined with more perturbation or higher precision: sometimes the iterates may still be too close to each other when they escape a deep minibrot, such that catastrophic absorption occurs. In progress...

For Burning Ship and other abs variations (and presumably hybrids too), series approximation can take the form of two bivariate real series in \(\Re(c)\) and \(\Im(c)\) for the real and imaginary parts of \(z\). But these are only good so long as the region is not folded by an absolute value, so typically only a few iterations can be skipped. Maybe the series can be split into two (or more) parts with the other side(s) shifted when this occurs? In progress...

Conclusion

Perturbation techniques, which greatly reduce the quantity of high precision iterations needed, together with (for well-behaved formulas) series approximation techniques, which reduce the quantity of low precision iterations needed still further, provide a vast speedup over classical algorithms that use high precision for every pixel. Rescaling can provide an additional constant factor speedup over using full range floatexp number types for most (not "deep needle") locations. Chained biseries approximation ("NanoMB2") and series approximation for abs variations and hybrids are still topics of research.

It remains open how to choose the \(G\) for Pauldelbrot's glitch detection criterion, and how to robustly compute series approximation skipping: there is still no complete mathematical proof of correctness with rigorous error bounds, although the images do most often look plausible and different implementations do tend to agree.