I benchmarked some Mandelbrot set renderers. Click pictures for bigger versions. The corresponding deep zoom images are these:

Location credits:

Olbaid-ST Deep Mandelbrot Zoom 023 |
Dinkydau Evolution Of Trees |
Dinkydau Ssssssssss |

(self-made) | (self-made) | (self-made) |

Traditional deep zoom rendering with arbitrary precision calculations has a constant cost per pixel. This is visible on the graph as horizontal lines for the renderer MDZ (Mandelbrot Deep Zoom); version 0.1.3 is an unreleased version with some small changes I made to allow the benchmarks to be run from a shell script. At the bottom right of the graph you can see that native machine precision in MDZ is much faster than any perturbation renderer.

Perturbation techniques allow native precision to be used for the bulk of the calculations, as deltas from an arbitrary precision reference. Series approximation techniques allow many per-pixel iterations to be skipped entirely. There is some per-image overhead, but the eventual per-pixel cost is lower when the image size increases. This is visible on the graph as downward-sloping lines for the renderer Kalles Fraktaler (version 2.12.5 will be released next month with some bug fixes and command line rendering support) and the renderer mandelbrot-perturbator, which is still highly experimental.

Each of these renderers has two lines, with different thresholds for Pauldelbrot's glitch detection heuristic. One conclusion to be drawn is that the threshold has minimal impact on mandelbrot-perturbator render times, while the higher threshold (necessary for correctness with some locations) can slow Kalles Fraktaler down by a significant amount. Kalles Fraktaler is typically slower than mandelbrot-perturbator with this more accurate mode.

A final conclusion is that while mandelbrot-perturbator flattens out to a constant low cost per pixel as the image size increases, at some locations Kalles Fraktaler starts to slow down further (the lines slope upwards). This indicates some performance bug that I hope to investigate at some point next month. In the meantime tiled rendering might be cheaper.

These benchmarks represent 20 days of (single core) CPU time on a quad core AMD Athlon II X4 640 Processor underclocked to 2.3GHz due to thermal issues. The benchmark data is in the Kalles Fraktaler 2 source code repository.


A black-and-white A5 paperback with 100 pages of Turing patterns, hand-selected from thousands of candidate images generated by a multi-layer reaction-diffusion biochemistry system simulated on a digital computer as a coupled cellular automaton.

Click the picture above for lots more information, including photos, and links to print-on-demand and source code too (a fork of my cca project).

I generated the page images as 100dpi bitmaps, then vectorized with potrace. The printed copy is really nice, smooth white paper good for colouring and a bright glossy cover. One small issue is that it's hard to get pencils right into the perfect binding, but I expected that before I had it made.

This project is what my GPU temperature throttle was for, though I'm sure it'll come in handy for other things as well.

A live-coded bytebeat music session inspired by approaching autumn:

A bit stroboscopic, be careful if that affects you. See Falling Leaves on archive.org or download:

MKV (164MB) | MP4 (61MB) | OGV (46MB) |

Falling Leaves audio code log, plus video rendering with mplayer and ffmpeg:

mplayer -benchmark -nosound -demuxer rawvideo \
  -rawvideo fps=31.25:w=16:h=16:y8 -vo pnm falling-leaves.1u8
ffmpeg -i falling-leaves.flac -framerate 31.25 -i "%08d.ppm" -sws_flags neighbor \
  -filter:v "scale=w=1024:h=1024, pad=w=1920:h=1080:x=448:y=28:color=0x808080" \
  -pix_fmt yuv420p -crf:v 0 -codec:a copy falling-leaves.mkv

(I uploaded this a couple of weeks ago but didn't get around to blogging about it until today.)

I last played live with GULCII in 2011 at LiWoLi and NOHUP Bow Arts Open around the same time. With further tweaks and enhancements, I've been rehearsing a short code recital for the FARM conference's Evening Of Algorithmic Arts at the Old Fire Station in Oxford, UK, this coming Saturday 9th September. Book your free ticket online and come along; it should be a good one with many performances.

GULCII (Graphical Untyped Lambda Calculus Interactive Interpreter) is an untyped lambda calculus interpreter supporting interactive modification of a running program, with graphical display of graph reduction. Lambda calculus is a minimal prototypical functional programming language developed by Alonzo Church in the 1930s. Church encoding uses folds to represent data as higher-order functions. Dana Scott's encoding represents algebraic data types as functions. Each has strengths and weaknesses.

The performance is a code recital, with the internal state of the interpreter visualized and sonified. Part 1 introduces Church encoding. Part 2 develops Scott encoding. An interlude takes in two non-terminating loops, each with its own intrinsic computational rhythm. Finally, Part 3 tests an equivalence between Church and Scott numerals.

The performance involves functional programming in two ways. Firstly, there is the live recital of lambda calculus code into the interpreter for evaluation. Secondly, the interpreter itself is written in the functional programming language Haskell.

Regarding related work, Henning Thielemann's "live-sequencer" has a similar approach to code evaluation, with names rebindable to terms during runtime. Sonification of data has a long history, including a mention by Douglas Adams in "*Dirk Gently's Holistic Detective Agency*" (1987), while sonification of machines is perhaps more niche. I was inspired by Valentina Vuksic's "Sei Personaggi Part 2", in which "*the magnetic fields of the memory modules (RAM) are being picked up acoustically while the experimental computer drama is moving on*"; the Puredyne GNU/Linux startup sound, "`cat /dev/mem > /dev/dsp`"; and also Dave Griffiths's "Betablocker" live-coding acid techno machine.

I'll be at the daytime FARM conference too, if you're at ICFP and want to say hi. Long beard, glasses, probably drinking coffee.

**EDIT:** it went really well, I'm pleased. Had some nice
feedback too, particularly a suggestion that I could have finished up by
defining:

churchPred = \n . toChurch (scottPred (fromChurch n))

I didn't catch the name of the suggester, but I think I'll use this in future, it's a very nice punchline. Thanks, mystery functional programmer! I'll try to improve the sonification too, it is rather minimal.

Since getting back home I worked a bit on the code, cleaning it up for a 0.3 release on Hackage (not quite ready yet). I also fixed an infelicity with the evaluator: now it reduces the right-hand branch of an apply node when the left-hand branch is irreducible (and the apply node itself is not a lambda applied to something, in which case it would beta reduce). This means it can make progress in more cases, for example (with Scott-encoded lists and numerals):

iterate succ zero

The version I performed with would get stuck at `cons zero `*stuff*.

**tanh()** is a nice function for saturation in audio, amongst
other applications. Near 0 it is similar to the identity function \(x\), leaving
sound unchanged. Far from 0 it tends to constant \(\pm 1\), with a smooth roll-off
curve that gives a soft-clipping effect, where louder signals are more distorted,
and the output never exceeds 0dBFS.

Unfortunately tanh() is computationally expensive, so approximations are desirable. One common approximation is a rational function:

\[ \tanh(x) \approx x \frac{27 + x^2}{27 + 9 x^2} \]

which the apparent source describes as "based on the pade-approximation of the tanh function with tweaked coefficients".

I also found mention of Padé approximants on a page primarily focussed on approximating tanh() with a continued fraction. So I set out to discover what they were. After some dead ends (a book appendix with 6 pages of badly-OCR'd FORTRAN code, to name but one), I eventually struck gold with this PDF preview:

"Outlines of Padé Approximation" by Claude Brezinski, in Computational Aspects of Complex Analysis, pp. 1-50, part of the NATO Advanced Study Institutes Series (ASIC, volume 102).

Abstract: In the first section Padé approximants are defined and some properties are given. The second section deals with their algebraic properties and, in particular, with their connection to orthogonal polynomials and quadrature methods. Recursive schemes for computing sequences of Padé approximants are derived. The third section is devoted to some results concerning the convergence problem. Some applications and extensions are given in the last section.

The preview contains a system of equations involving a rational function:

\[ [p/q]_f := \frac{\sum_{i=0}^p a_i x^i}{\sum_{i=0}^q b_i x^i} \]

where without loss of generality \(b_0 = 1\). The coefficients are derived from the coefficients \(c_i\) of the Maclaurin series of \(f\) (which is its Taylor series expanded about \(0\)):

\[ f := \sum_{i=0}^\infty c_i x^i \]

The system of equations is very regular, here's some Haskell code to solve for the \(a\)s and \(b\)s given the \(c\)s:

pade :: [Rational] -> Int -> Int -> ([Rational], [Rational])
pade c p q = (a, b)
  where
    m = take q . map (take q) . tails . drop (p - q + 1) $ c
    v = take q . drop (p + 1) . map negate $ c
    Just m1 = inverse m -- FIXME this will crash if it isn't invertible
    b = 1 : reverse (m1 `mulmv` v)
    a = take (p + 1) $
          [ reverse cs `dot` bs
          | (bs, cs) <- map unzip . drop 1 . inits $ zip (b ++ repeat 0) c
          ]

This needs an `inverse :: [[Rational]] -> Maybe [[Rational]]`; the pure Gaussian elimination code from an old mailing list post works fine with minor modifications for current GHC versions (it needs an extra Eq constraint now).

With some extra output beautification code (and of course the coefficients of the power series for tanh(), which involve the Bernoulli numbers) the first few Padé approximants for tanh() are:

[1/0] = x
[1/2] = x*(3)/(3+x**2)
[3/2] = x*(15+x**2)/(15+6*x**2)
[3/4] = x*(105+10*x**2)/(105+45*x**2+x**4)
[5/4] = x*(945+105*x**2+x**4)/(945+420*x**2+15*x**4)
[5/6] = x*(10395+1260*x**2+21*x**4)/(10395+4725*x**2+210*x**4+x**6)
[7/6] = x*(135135+17325*x**2+378*x**4+x**6)/(135135+62370*x**2+3150*x**4+28*x**6)
[7/8] = x*(2027025+270270*x**2+6930*x**4+36*x**6)/(2027025+945945*x**2+51975*x**4+630*x**6+x**8)

Some of these are shown in the image at the top of this post, along with the music-dsp approximation with "tweaked coefficients". The [7/6] approximation agrees with the truncated Lambert's continued fraction from the linked page; I'll probably end up using the [5/6] in various experiments. You can download the source code here: Pade.hs.

While rendering some GPU-intensive OpenGL stuff I got scared when my graphics card hit 90C, so I paused the process until it had returned to something cooler. I got fed up pausing and restarting it by hand, so I wrote this small script:

#!/bin/bash
kill -s SIGSTOP "${@}"
running=0
stop_threshold=85
cont_threshold=75
while true
do
  temperature="$(nvidia-smi -q -d TEMPERATURE | grep 'GPU Current Temp' | sed 's/^.*: \(.*\) C$/\1/')"
  if (( running ))
  then
    if (( temperature > stop_threshold ))
    then
      echo "STOP ${temperature} > ${stop_threshold}"
      kill -s SIGSTOP "${@}"
      running=0
    fi
  else
    if (( temperature < cont_threshold ))
    then
      echo "CONT ${temperature} < ${cont_threshold}"
      kill -s SIGCONT "${@}"
      running=1
    fi
  fi
  sleep 1
done | ts

If you want to run it yourself, I advise checking the output from nvidia-smi on your system, because its manual page says the format isn't stable. Moreover, I suggest monitoring the temperature yourself, at least until you're sure it's working ok for you. Usage is simple: just pass on the command line the PIDs of the processes you want to throttle by GPU temperature; typically these would be OpenGL applications (or Vulkan / OpenCL / CUDA / whatever else they come up with next).

I'll be performing together with Medial Ages at Noise in the Shed, part of Smash it Out on Saturday, August 12 at 4:00 PM - 8:00 PM at the Windmill Brixton, 22 Blenheim Gardens, London SW2 5BZ, United Kingdom.

Four hours of Noise in the Shed! From 4pm till 8, this is part of Smash it Out which is in the main room and which will go until late.

Curated by Tim Drage and Lisa McKendrix. This year's line-up includes:

- Harmergeddon: kinetic pulses, drifting textures and undulating drones
  https://harmergeddon.bandcamp.com/
- Nnja Riot: deconstruction of popular music using classical instruments or electronics and DIY
  https://nnjariot.bandcamp.com/
  http://www.listenlisse.co.uk/nnja-riot.html
  https://www.youtube.com/watch?v=Ew3LlGe0daA
- Psychiceyeclix: audio/visual art, circuit bending, experiment, xperimental, other ideas, transcend, expand, unfold
  https://www.facebook.com/psychiceyeclix/
- Cementimental: filters, rewired circuitry and gear wreckage
  https://cementimental.bandcamp.com/
- General Harm: costumed performance and home made instruments
  https://www.facebook.com/GeneralHarm/
  https://youtu.be/T1JLD_I7B50?list=PLhbXgqzLycmSHmCF5oVsigY2cSOUQ-J8s
- Oliotronix: chip tunes, 8-bit dirty rave
  https://www.facebook.com/oliotronix/
- Medial Ages + Claude Heiland-Allen: strobe noise meets live programming
  https://www.youtube.com/watch?v=DSmTeZQkTT8&feature=youtu.be
  http://nebularosa.net/claude_heiland_allen/
- Jobby: "Last year 'MON performed Jobby, this year Jobby perform!"

A compilation of last year's artists: www.noiseshed.bandcamp.com/releases

Fingers crossed the weather will be good.

**hp2pretty** is a program to graph heap profiles output by
Haskell programs compiled by GHC. It makes images like this by default:
Haskell programs compiled by GHC. It makes images like this by default:

Today I hacked on it some more and added some new features, like `--reverse` to switch the order of the bands:

and `--sort=stddev` to sort by band standard deviation:

and `--sort=name` to sort by cost center name:

and `--trace=50` to combine the last percentage of trace elements into one band:

and `--bands=5` to show only a certain number of bands:

and the icing on the cake, `--pattern` to use pattern fills for low ink printing:

Here's the `--help` output, using the optparse-applicative package for command line arguments:

hp2pretty - generate pretty graphs from heap profiles

Usage: hp2pretty [--uniform-scale AXES] [--sort FIELD] [--reverse]
                 [--trace PERCENT] [--bands COUNT] [--pattern] FILES...
  Convert heap profile FILES.hp to pretty graphs FILES.svg

Available options:
  --uniform-scale AXES     Whether to use a uniform scale for all outputs.
                           One of: none (default), time, memory, both.
  --sort FIELD             How to sort the bands. One of: size (default),
                           stddev, name.
  --reverse                Reverse the order of bands.
  --trace PERCENT          Percentage of trace elements to combine.
                           (default: 1.0)
  --bands COUNT            Maximum number of bands to draw (0 for unlimited).
                           (default: 15)
  --pattern                Use patterns instead of solid colours to fill bands.
  FILES...                 Heap profiles (FILE.hp will be converted to
                           FILE.svg).
  -h,--help                Show this help text

Version 0.7 was also released this week, featuring a contributed bugfix in parsing. You can get the latest from git here:

git clone https://code.mathr.co.uk/hp2pretty.git

or install from Hackage.

A while ago I read this paper and finally got around to implementing it this week:

Real-Time Hatching

Emil Praun, Hugues Hoppe, Matthew Webb, Adam Finkelstein

Appears in SIGGRAPH 2001

The key concept in the paper is the "tonal art map", in which strokes are added to a texture array's mipmap levels to preserve coherence between levels and across tones - each stroke in an image is also present in all images above and to the right:

My possibly-novel contribution is to use the inverse (fast) Fourier transform (IFFT) to generate blue noise for the tonal art map generation. This takes a fraction of a second, compared to the many hours for void-and-cluster methods at large image sizes. The quality may be lower, but that's something to investigate another time; it's good enough for this hatching experiment. Here's a comparison of white and blue noise; the blue noise is perceptually much smoother, lacking low-frequency components:

The other parts of the paper I haven't implemented yet, namely adjusting the hatching to match the principal curvature directions of the surface. This is more a mesh parameterization problem - I'm being simple and generating UVs for the bunny by spherical projection, instead of something complicated and good-looking.

My code is here:

git clone https://code.mathr.co.uk/hatching.git

Note that there are horrible hacks in the shaders for the specific scene geometry at the moment; hopefully I'll find time to clean it up and make it more general soon. You'll need to download the `bunny.obj` from cs5721f07.

The Mandelbrot set is approximately self-similar, containing miniature baby Mandelbrot set copies. However, all of these copies are distorted, because there is only one perfect circle in the Mandelbrot set. The complex-valued size estimate can be used as a multiplier for looping zoom animations, though the difference in decorations and visible distortion make the seam a little jarring. Here are some examples:

period \(3\) near \(-2\)

period \(4\) near \(i\)

period \(5\) near \(-1.5 + 0.5 i\)

The trick to the looping zoom is to find an appropriate center: if the nucleus of the baby is \(c\) and the complex size is \(r\), there is another miniature copy near the baby around \(c + r c\) with size approximately \(r^2\). Taking the limit gives a geometric progression:

\[c + r c + r^2 c + \cdots = \frac{c}{1 - r}\]

Here's the code used to render the images (also found in the mandelbrot-graphics repository):

#include <stdio.h>
#include <mandelbrot-graphics.h>

int main(int argc, char **argv) {
  (void) argc;
  (void) argv;
  const double _Complex r0 = 1;
  const double _Complex c0 = 0;
  int periods[3] = { 3, 4, 5 };
  double _Complex c1s[3] = { -2, I, -1.5 + I * 0.5 };
  int w = 512;
  int h = 512;
  m_pixel_t red = m_pixel_rgba(1, 0, 0, 1);
  m_pixel_t black = m_pixel_rgba(0, 0, 0, 1);
  m_pixel_t white = m_pixel_rgba(1, 1, 1, 1);
  double er = 600;
  int maxiters = 1000;
  m_image *image = m_image_new(w, h);
  if (image) {
    m_d_colour_t *colour = m_d_colour_minimal(red, black, white);
    if (colour) {
      for (int k = 0; k < 3; ++k) {
        int period = periods[k];
        double _Complex c1 = c1s[k];
        m_d_nucleus(&c1, c1, period, 64);
        double _Complex r1 = m_d_size(c1, period);
        for (int frame = 0; frame < 50; ++frame) {
          double f = (frame + 0.5) / 50;
          double _Complex r = cpow((r1), f) * cpow((r0), 1 - f);
          double _Complex c = c1 / (1 - r1);
          m_d_transform *rect = m_d_transform_rectangular(w, h, 0, 1);
          m_d_transform *move1 = m_d_transform_linear(- c / 2.25, 1);
          m_d_transform *zoom = m_d_transform_linear(0, r * 2.25);
          m_d_transform *move2 = m_d_transform_linear(c, 1);
          m_d_transform *rm1 = m_d_transform_compose(rect, move1);
          m_d_transform *zm2 = m_d_transform_compose(zoom, move2);
          m_d_transform *transform = m_d_transform_compose(rm1, zm2);
          m_d_render_scanline(image, transform, er, maxiters, colour);
          char filename[100];
          snprintf(filename, 100, "%d-%02d.png", k, frame);
          m_image_save_png(image, filename);
          m_d_transform_delete(transform);
          m_d_transform_delete(zm2);
          m_d_transform_delete(rm1);
          m_d_transform_delete(move2);
          m_d_transform_delete(zoom);
          m_d_transform_delete(move1);
          m_d_transform_delete(rect);
        }
      }
      m_d_colour_delete(colour);
    }
    m_image_delete(image);
  }
  return 0;
}

I used ImageMagick to convert each PNG to GIF, then gifsicle to combine into animations.

The Mandelbrot set is asymptotically self-similar about pre-periodic Misiurewicz points. The derivative of the cycle (with respect to \(z\)) can be used as a multiplier for seamlessly looping zoom animations. Here are some examples:

const dvec2 c0 = dvec2(-0.22815549365396179LF, 1.1151425080399373LF);
const int pre = 3;
const int per = 1;

const dvec2 c0 = dvec2(-0.10109636384562216LF, 0.9562865108091415LF);
const int pre = 4;
const int per = 3;

(The above example's Misiurewicz point has period 1, but using 3 here avoids rapid spinning.)

const dvec2 c0 = dvec2(-0.77568377LF, 0.13646737LF);
const int pre = 24;
const int per = 2;

Here is the rest of the code that made the images. It's for Fragmentarium, with my (as yet unreleased, but coming soon) `Complex.frag` enhancements for dual numbers and double precision:

#version 400 core
#include "Complex.frag"
#include "Progressive2D.frag"

uniform float time;

// insert snippets from above in here to choose image

const double r0 = 0.00001LF;

vec3 color(vec2 p) {
  // calculate multiplier for zoom
  dvec4 z = cVar(0.0LF);
  dvec4 c = cConst(c0);
  for (int i = 0; i < pre; ++i) z = cSqr(z) + c;
  z = cVar(cVar(z));
  for (int i = 0; i < per; ++i) z = cSqr(z) + c;
  dvec2 m = r0 * dvec2(cPow(vec2(cInverse(cDeriv(z))), mod(time, float(per)) / float(per)));
  const int maxiters = 1000;
  const double er2 = 1000.0LF;
  c = cVar(c0 + cMul(m, p));
  z = cConst(0.0LF);
  double pixelsize = cAbs(m) * double(length(vec4(dFdx(p), dFdy(p))));
  int i;
  for (i = 0; i < maxiters; ++i) {
    z = cSqr(z) + c;
    if (cNorm(z) > er2) { break; }
  }
  if (i == maxiters) {
    return vec3(1.0, 0.7, 0.0);
  } else {
    double de = 2.0 * cAbs(z) * double(log(float(cAbs(z)))) / cAbs(cDeriv(z));
    float grey = tanh(clamp(float(de / pixelsize), 0.0, 8.0));
    return vec3(grey);
  }
}

Wolf Jung's program Mandel has an "algorithm 9" that allows locating zeroes of the iterated polynomial at a certain period, where 4 colours meet. But I wanted to find the zeroes for lots of periods all at once. Previously I did this in a way that didn't scale efficiently to deep zooms, so I adapted the "algorithm 9" technique. Not yet implemented is the extension of this code to use perturbation techniques for deep zooms, but it should be perfectly possible.

The first thing to do is initialize the array of \(c\) values; here I use my mandelbrot-graphics library, as the support code (not shown) uses it for imaging:

void initialize_cs(int m, int n, m_d_transform *t, double _Complex *cs) {
  #pragma omp parallel for
  for (int j = 0; j < n; ++j) {
    for (int i = 0; i < m; ++i) {
      double _Complex c = i + I * j;
      double _Complex dc = 1;
      m_d_transform_forward(t, &c, &dc);
      int k = i + j * m;
      cs[k] = c;
    }
  }
}

Then in the iteration step, calculate a flag for which quadrant the \(z\) iterate is in. This is set as a bit mask, so ORing many masks together corresponds to set union:

void step_zs(int mn, char *qs, double _Complex *zs, const double _Complex *cs) {
  #pragma omp parallel for
  for (int i = 0; i < mn; ++i) {
    // load
    double _Complex c = cs[i];
    double _Complex z = zs[i];
    // step
    z = z * z + c;
    // compute quadrant
    char q = 1 << ((creal(z) > 0) | ((cimag(z) > 0) << 1));
    // store
    zs[i] = z;
    qs[i] = q;
  }
}

Now the meat of the algorithm: it scans across the data with a 3x3 window, to find where all 4 colours meet in one small square. If that happens, it checks that the 3x3 square has a local minimum of \(|z|\) at its center, which means that the point found really is near a zero (a proof of that assertion follows immediately from the minimum modulus principle).

int scan_for_zeroes(int m, int n, int ip, int *ops, double _Complex *ocs,
                    const char *qs, const double _Complex *zs, const double _Complex *ics) {
  int o = 0;
  // loop over image interior, to avoid tests in inner 3x3 loop
  #pragma omp parallel for
  for (int j = 1; j < n - 1; ++j) {
    for (int i = 1; i < m - 1; ++i) {
      // find where 4 quadrants meet in 3x3 region
      char q = 0;
      for (int dj = -1; dj <= 1; ++dj) {
        int jdj = j + dj;
        for (int di = -1; di <= 1; ++di) {
          int idi = i + di;
          int kdk = idi + jdj * m;
          q |= qs[kdk];
        }
      }
      if (q == 0xF) {
        // 4 quadrants meet, check for local minimum at center
        double minmz = 1.0/0.0;
        for (int dj = -1; dj <= 1; ++dj) {
          int jdj = j + dj;
          for (int di = -1; di <= 1; ++di) {
            int idi = i + di;
            int kdk = idi + jdj * m;
            double mz = cabs(zs[kdk]);
            minmz = mz < minmz ? mz : minmz;
          }
        }
        int k = i + j * m;
        double mz = cabs(zs[k]);
        if (mz <= minmz && minmz < 1.0/0.0) {
          // we found a probable zero, output it
          double _Complex ic = ics[k];
          int out;
          #pragma omp atomic capture
          out = o++;
          ops[out] = ip;
          ocs[out] = ic;
        }
      }
    }
  }
  return o;
}

To be safe, the output arrays should be sized at least the desired number of elements plus the number of pixels in the image (which is the maximum number that can be output in one pass). Most of the extra space will be unused by the time the stopping condition (enough space left) is reached.

An earlier version was several times slower, partly due to caching `cabs()` calls for every pixel, though only very few pixels are near a zero at any given iteration.