Late last year I implemented some coupled continuous cellular automata, inspired by Softology's experiments. Now I'm finally getting around to blogging about it. I used OpenGL shaders, here's some of the fragment shader source of the main algorithm:

void main() { vec4 s1 = texture(state, coord, 1.0); vec4 s100 = texture(state, coord, 100.0); vec4 s; for (int k = 0; k < 4; ++k) s[k] = texture(state, coord, blur[k])[k]; vec4 h = texture(history, coord); s = coupling * (s - s100) + h; s = speed * s; s = mix(s1, vec4(0.5) + 0.5 * cos(s), 0.125); h = mix(s, h, decay); state_out = s; history_out = h; }

The non-linearity of the `cos()`

on the coupled input acts like
a "reaction", the blurring (looking up reduced mipmap levels from the texture)
acts like "diffusion".

Colouring is done with another affine matrix transform, the output from which is thresholded and clamped, before edge-detection filter is applied. The edge-detection uses dFdx and dFdy, so the results are coarse (these derivatives are typically computed for blocks of 2x2 pixels, rendered together in parallel) - for better results the edge detection could be done in another pass, or the whole thing could run at double the resolution and be resized down to screen size afterwards.

Here's a video of it in action:

Here are some static images:

Here is another video, from January when it was still in colour:

Here's where you can get the code:

git clone https://code.mathr.co.uk/cca.git

Future work might be to do proper Gaussian blurs (it's separable, so even large radius might be feasible in real-time) instead of the cheap (but yielding squarish grid artifacts) mipmap reduction.

**EDIT** I worked on it some more, now in colour and with a
high quality mode that does Gaussian blur (on my system frame rate drops from
~60fps to between ~5fps and ~30fps depending on blur radius). Pictures:

I also added a mutation mode, which randomizes the parameters one by one at random. Here's a final example video showing off the new features:

]]>

After instrumenting Monotone with OpenGL timer queries I could see where the major bottleneck lay:

IFS( 7011.936000 ) FLAT( 544.672000 ) PBO( 2921.728000 )SORT( 6797.760000 ) LUP( 71136.064000 )TEX( 284.224000 ) DISP( 272.480000 )

LUP is the per-pixel binary search lookup for histogram equalisation (to compress the dynamic range of the HDR fractal to something suitable for display), the previous SORT generates the histogram from a 4x4 downscaled image. A quick calculation shows that this LUP is taking 80% of the GPU time, so is a good focus for optimisation efforts.

The 4x4 downscaled image for the histogram is still a lot of pixels: 129600. LUP involves finding an index into this array, which gives a value with around 17bits of precision. However, typical computer displays are only 8bit (256 values) so the extra 9 random-access texture lookups per pixel to get a more accurate value are a waste of time and effort. Combined with a reduction of the downscaled image to 8x8, the optimisation to compute a less accurate (but visually indistinguishable) histogram equalisation allows Monotone to now run at 30fps at 1920x1080 full HD resolution. Here are the post-optimisation detailed timing metrics:

IFS( 7087.104000 ) FLAT( 509.888000 ) PBO( 2744.864000 )SORT( 1409.440000 ) LUP( 15696.352000 )TEX( 281.472000 ) DISP( 290.848000 )

A productive day!

]]>I blogged about this before with an animation (Rolling Torus), the only thing missing now is the Fragmentarium source code, so here it is:

]]>#define providesColor #include "Soft-Raytracer.frag" uniform float time; const float pi = 3.141592653; const float s = 2.0; const float ri = s / (sqrt(s * s + 1.0) + 1.0); const float ro = s / (sqrt(s * s + 1.0) - 1.0); const float rc = 0.5 * (ro + ri); const float rt = 0.5 * (ro - ri); const vec3 rgb[5] = vec3[]( vec3(1.0, 0.7, 0.0), vec3(0.7, 1.0, 0.0), vec3(0.0, 0.7, 1.0), vec3(0.7, 0.0, 1.0), vec3(1.0, 0.0, 0.7)); float torus(vec3 z) { return length(z - rc * normalize(vec3(z.xy, 0.0))) - rt; } float plane(vec3 z) { return z.z; } vec3 spin(vec3 z) { float a = 2.0 * pi * time / 3.0; mat2 m = mat2(cos(a), sin(a), -sin(a), cos(a)); z.xy = m * z.xy; return z.xzy - vec3(0.0,ro,0.0); } vec3 baseColor(vec3 q, vec3 n) { vec2 uv = vec2(0.0); if (q.z < 0.01) { uv = q.xy * sqrt(5.0); } else { vec3 p = spin(q); float k = -1.0; float l = 1.0; if (p.z > 0.0) { l = -l; } if (length(p.xy) > rc) { k = -k; } float a = (p.z * p.z * sqrt(s * s + 1.0) + k * sqrt(max(1.0 - p.z * p.z * s * s, 0.0))) / (p.z * p.z + 1.0); float y = l * acos(clamp(a, -1.0, 1.0)) / (2.0 * pi); float x = s * atan(p.x, p.y) / (2.0 * pi); if (x < 0.0) { x = s + x; } if (y < 0.0) { y = 1.0 + y; } x = 5.0 * x; y = 5.0 * y; uv = vec2(y, x); float b = atan(1.0, 2.0); mat2 m = mat2(cos(b), sin(b), -sin(b), cos(b)) * sqrt(5.0); uv = m * uv; } vec3 t; if (mod(uv.x, 1.0) < 0.25 || mod(uv.y, 1.0) < 0.25) { t = vec3(0.0); } else { int k = clamp(int(mod(floor(uv.x) + 2.0 * floor(uv.y), 5.0)), 0, 4); t = rgb[k]; } return vec3(t); } float DE(vec3 z) { return min(torus(spin(z)), plane(z)); } #preset default FOV = 0.3 Eye = -5,-5,2.25 Target = -0.593772,1.17202,1.5097 Up = 0,0,1 EquiRectangular = false FocalPlane = 5 Aperture = 0.01062 Gamma = 1 ToneMapping = 1 Exposure = 1 Brightness = 1 Contrast = 1 Saturation = 1 GaussianWeight = 5.2308 AntiAliasScale = 1.7333 Detail = -5.47582 DetailAO = -3.26669 FudgeFactor = 1 MaxRaySteps = 317 BoundingSphere = 12 Dither = 1 NormalBackStep = 1 AO = 0,0,0,0.32456 Specular = 0.2 SpecularExp = 24.051 SpotLight = 0.737255,0.686275,0.533333,1.8841 SpotLightPos = 10,-5.0684,8.0822 SpotLightSize = 2.3 CamLight = 0.792157,0.556863,0.682353,0.6579 CamLightMin = 3e-05 Glow = 0,0.917647,1,0 GlowMax = 16 Fog = 0 Shadow = 0.82906 NotLocked Sun = -1.29354,1.01588 SunSize = 0.001 Reflection = 0 NotLocked BaseColor = 1,1,1 OrbitStrength = 1 X = 0.5,0.6,0.6,0.7 Y = 1,0.6,0,0.4 Z = 0.8,0.78,1,0.5 R = 0.4,0.7,1,0.12 BackgroundColor = 0.501961,0.501961,0.501961 GradientBackground = 1.66665 CycleColors = false Cycles = 0.1 EnableFloor = false NotLocked FloorNormal = 0,0,0 FloorHeight = 0 FloorColor = 1,1,1 #endpreset

My video piece Monotone has been accepted to the Mozilla Festival art exhibition. Mozilla Festival 2016 takes place October 28-30, at Ravensbourne College, London.

Since submitting the pre-rendered video loop I've been working on improving the real-time rendering mode of the Monotone software. The main bottle-neck at this time is the histogram equalisation to take the high dynamic range calculations down to a low dynamic range image for display. I did manage to get a large speed boost by calculating the histogram on a \(\frac{1}{4} \times \frac{1}{4}\) downscaled image, but on my hardware it only achieves \(\frac{1}{2} \times \frac{1}{2}\) of the desired resolution (HD 1920x1080). On my NVIDIA GTX 550 Ti with 192 CUDA cores I get 960x540 at 30 frames per second. Recent hardware has 1000s of cores, so perhaps it's just a matter of throwing more grunt at the problem.

If you want to try it (and have OpenGL 4 capable hardware, and development headers installed for GLFW and JACK, among other things; only tested on Debian):

git clone https://code.mathr.co.uk/monotone.git git clone https://code.mathr.co.uk/clive.git cd monotone/src make ./monotone

You can also browse the monotone source code repository. If you do have a significantly more powerful GPU than me, you can try to edit the source code monotone.cpp to change "#define SCALE 2" to "#define SCALE 1", which will make it target 1920x1080 instead of half that in each dimension. I'd love to hear back if you get it working (or if you have trouble getting it running, maybe I can help).

**UPDATE** here are some photos, the projection was really
impressive, so I'm satisfied even though the sound aspect was absent:

The festival was pretty interesting, many many things all going on at once. Highlight was the Sonic Pi workshop (though I spent most of the time dist-upgrade-ing to Debian Stretch so I could install it), and the Nature In Code workshop was also interesting (though it was packed full and uncomfortable so I didn't attend the full session).

]]>Back at the end of April last year I think I was futzing about on math.stackexchange.com answering a question about rendering negative multibrot sets, for example one produced by iterations of \(z \to z^{-2} + c\). I tried applying the atom domain colouring from the regular Mandelbrot set, but found it looked better if I accumulated all the partials with additive blending, not just the final domain. Here's a zoomed in view:

I implemented it as a GLSL fragment shader in Fragmentarium, here's the source code (which you can download too):

// Mandelbrot set for \( z \to z^{-n} + c \) coloured by Lyapunov atom domains// Created: Thu Apr 30 15:10:00 2015#include "Progressive2D.frag" #include "Complex.frag" const float pi = 3.141592653589793; const float phi = (sqrt(5.0) + 1.0) / 2.0; #group Lyapunov atom domains uniform int Iterations; slider[10,200,5000] uniform int Power; slider[-16,-2,16] vec3 color(vec2 c) {// critical point is \( 0 \) for positive Power, and \( 0^Power + c = c \)// critical point is \( \infty \) for negative Power, and \( \infty^Power + c = c \)// so start iterating from \( c \)vec2 z = c;// Lyapunov exponent accumulatorfloat le = 0.0;// atom domain accumulatorfloat minle = 0.0; int mini = 1;// accumulated colourvec4 rgba = vec4(0.0); for (int i = 0; i < Iterations; ++i) {// \( zn1 \gets z^{Power - 1} \)vec2 zn1 = vec2(1.0, 0.0); for (int j = 0; j < abs(Power - 1); ++j) { zn1 = cMul(zn1, z); } if (Power < 0) { zn1 = cInverse(zn1); }// \( dz \gets Power z^{Power - 1} \)vec2 dz = float(Power) * zn1;// \( z \gets z^{Power} + c \)z = cMul(zn1,z) + c;// \( le \gets le + 2 log |dz| \)float dle = log(dot(dz, dz)); le += dle;// if the delta is smaller than any previous, accumulate the atom domain domainif (dle < minle) { minle = dle; mini = i + 1; float hue = 2.0 * pi / (36.0 + 1.0/(phi*phi)) * float(mini); vec3 rainbow = 2.0 * pi / 3.0 * vec3(0.0, 1.0, 2.0); vec3 domain = clamp(vec3(0.5) + 0.5 * sin(vec3(hue) + rainbow), 0.0, 1.0); rgba += vec4(domain, 1.0); } }// accumulated 'iterations' logs of squared magnitudes// so divide by 2 iterationsle /= 2.0 * float(Iterations);// scale accumulated colour and blacken interiorreturn mix(rgba.rgb / rgba.a, vec3(le < 0.001 ? 0.0 : tanh(exp(le))), 0.5); }

More negative powers than -2 don't look very good, though.

]]>The classic Sayagata tiling is a Euclidean (flat) plane tesselation based on a square grid. For May's calendar image I thought it would be fun to make a hyperbolic variation, with 5 squares meeting at each vertex instead of 4. Furthermore, to wrap the hyperbolic plane into a repeating ring shape like what Bulatov did (warning, that page is slow loading but worth it in the end).

I implemented it as a GLSL fragment shader in Fragmentarium (download my source code) with a few constants for changing the tiling parameters (the hyperbolic symmetry group, the orientation across the band, number of repetitions, whether to draw a pretty pattern or just triangles). Here are a few variations:

The code has a few comments; the gist of the overall operation is to transform the coordinates from the ring back into the PoincarĂ© disc model of hyperbolic geometry, then repeatedly apply hyperbolic symmetries towards a central fundamental region, which finally gets the Sayagata texture. Note that tweaking some of the values leads to ugly seams with the parts not lining up - still some bugs I guess! (WONTFIX DONTDOTHAT)

]]>**UPDATE** my GL 3.3 version may have been
faster, but modifications to the original program are faster
still (there was a misapplied patch slowing things down
drastically, that getting fixed made it easier to improve).
Check
this rrv issue thread
for more details.

Radiosity is a method for computing diffuse lighting. Unlike raytracing, radiosity is viewpoint independent, which means the lighting calculations can be performed once for a given scene, then visualized multiple times with different virtual camera positions. But ray tracing also supports specular reflections and transparency, so eventually a more complete renderer could combine both.

While searching for radiosity implementations I found rrv, which uses OpenGL to render the view from each patch (triangle) in the scene. My first experiences were disappointing - it was very slow, but that turned out to be missing optimisation flags. But I found some further ways to optimize it - first using vertex buffers instead of glBegin() so that the scene geometry is uploaded to the GPU only once instead of per-patch, secondly using a large flat array instead of a C++ std::map to accumulate the results of rendering. These optimisations gave a speed boost around 7x, but I wasn't satisfied.

I ended up porting the radiosity renderer core to run almost entirely on the GPU, using OpenGL 3.3 (though it could probably be ported to OpenGL 2.1 with a few extensions like floating point textures and framebuffers). The speed boost is around 30x faster than the original OpenGL 1 implementation. Here's rrv's room4 demo scene with lighting calculated by my port in a little over 12 minutes (the visualizer code remains unchanged):

Another change I made involved the form factor calculations for hemicube projection. Radiosity ideally projects to a hemisphere and from there to a circle, but rectangular grids are more convenient for computers. So radiosity implementations tend to project to a hemi-cube, rasterizing the scene 5 times, once for the top and each of the four sides. Then each pixel in the result is scaled by a delta form factor, so that it corresponds more closely to the hemisphere circle projection.

rrv's form factor calculations used a product of cosines, which looked suspect to me as it didn't take into account the edges and corners of the hemicube, so I did some searching and found a paper which gave some different formulas:

The Hemi-cube: a Radiosity Solution for Complex Environments

Michael F. Cohen and Donald P. Greenberg

ACM SIGGRAPH Volume 19, Number 3, 1985

I implemented them in a test program and plotted a comparison between RRV2007 (magenta) and Cohen1985 (green):

The difference is small, but visible in the visualizer as slight shape differences between quantized bands on the gradients. Here's a comparison between the output after 1 step with the two different form factors, amplified 64 times (mid-grey is equal output):

When time allows, I have some ideas for some additional features, like lighting groups (compute the radiosity for each light separately, then combine them at visualization time - hopefully I'm correct in assuming linearity - allowing the brightness (but not colour) of each light to be adjusted separately) and scene symmetry (like an infinite corridor repeating every 5m or so, the symmetry means the radiosity for translationally equivalent parts of the scene must be identical).

I put my changes in a branch at code.mathr.co.uk/rrv, which may end up merged to the upstream at github.com/kdudka/rrv.

]]>I've already blogged a bit about this calendar image for April (yep, late again...). I made a diagram using Fragmentarium to show the principle behind it:

Five points forming a regular pentagon rotate, their vertical coordinates make five sine waves. Each wave is split into a number of rectangles, whose width depends on the height of the wave. Then all five sets of rectangles are overlayed. Because the centroid of the pentagon is fixed no matter the rotation angle, the sum of the waves remains constant - this property is used in windowed overlap add resynthesis of FFT for frequency-domain processing in audio.

But it's not the same as the original, which you can tell by zooming in:

The original (above) has an equal amount of white space between each coloured strip, while in the recreation (below) the amount of white space changes. And the order of the colours in the recreation is flipped from the original, and the coloured strips in the recreation are a bit narrower than the original too.

You can download the source code for the recreation: gradient.frag. For extra fun switch to animation mode in Fragmentarium to see the pentagon rotate with the waves streaming out from it. I did render a video but it looks really horrible compressed, so unless you have Fragmentarium or you wait until I get around to porting this code to WebGL (not very likely..) you'll just have to imagine it.

]]>mightymandel now has a home page, where you can find documentation, downloads, links, and an image gallery. Highlights from the changes since v15 include:

- sliced rendering
- Splits iterate computations into smaller batches, allowing larger images to be rendered in the same amount of video memory (each pixel needs 12-16 bytes, but each iterate needs 128-160 bytes).
- progressive rendering
- When sliced rendering is enabled, the slices are ordered in such a way that a lofi pixelated image appears first, which gradually refines into the final hifi image.
- improved glitch correction
- Finding reference points now uses a two-pass blob extraction algorithm, first glitched blobs are extracted, then the most-glitched sub-blobs are extracted. Small blobs are ignored (default 1 pixel) for faster completion (can be disabled for previous behaviour).
- improved no-de colouring
- Now closer to the algorithm described in my blog post faking distance estimate colouring which should make it smoother.
- zoom motion blur
- The zoom assembler adds motion blur to reduce unpleasant strobing for fast zooms at standard frame rates. The shutter speed is variable, so you can adjust the amount of blurring to suit your tastes.
- downloads
- No longer do you need git to get mightymandel, there are source code tarballs and Windows binaries (cross-compiled from Linux).

A final note, my future blog posts about mightymandel will have fewer tags, you can subscribe to the mightymandel tag feed to stay updated.

]]>Not the usual pretty pictures, but I want to explain the bugs I fixed in v15 of mightymandel released today.

The top row was the killer bug, from left to right

- image is being calculated (unfinished areas in blue)
- suddenly the whole top quarter of the image is corrupted, including areas that weren't even being calculated
- this is what it should have looked like (and what it looked like when I fixed the bugs)
- finished! only a few little glitches visible
- not to worry, mightymandel can mask small glitches by blending neighbour pixel colours

The second row has more visualisation of glitch detection, correction by adding new reference points, and masking. The last image (big yellow splat) is what happens when you set the sharpness threshold too far away from sensible values - it thinks the yellow area will never escape, so it must be interior. The third row shows the effect of some different sharpness settings, from high threshold at the left to low threshold (high quality, takes much longer) at the right. Setting sharpness is a new feature in v15 (before it was a hardcoded value).

The last row illustrates two more bugs and solutions. The first two pictures were generated with the --tile option to make images bigger than can be handled comfortably in one go. The left has a big yellow square, which was caused by the computed reference point being NaN (not a number, oh the joys of floating point arithmetic) which corrupted everything. Checking for NaN and using an alternative reference point fixed that issue.

The final three pictures show a particularly tough location to render (it's "other-worlds-frightening-digital-2.kfr" from the Kalles Fraktaler gallery zip, but I don't know who found those coordinates). The really tight spirals make the iteration count go up very quickly, which defeats the completion checker (the part of the code that decides when to give up - mightymandel doesn't have a fixed iteration limit, it just keeps going until it seems reasonable to stop). Setting the --max-glitch and --sharpness flags both to 0 for best quality still doesn't fix it (second image of the three). Eventually I tried some stuff (but I forgot exactly what - oops) and got the final image, which took far too long to render for comfort's sake (10s of minutes - which is nothing really, without the perturbation techniques it would take days).

See the CHANGES file in the git repository for full details, but here's the highlights:

- random image corruption bugs fixed, reference finding bugs fixed
- combine --tile and --zoom for high res zoom sequences
- much more documentation (both for users and developers)
- many more example parameters from around the web
- a vastly improved test suite script (and test suite pass rate is getting good too)

Get mightymandel v15 here:

git clone http://code.mathr.co.uk/mightymandel.git git checkout v15

And you can also browse the source code at mightymandel on gitorious mightmandel on code.mathr.co.uk. PS: here's a pretty picture after all, just for you.

]]>Two stable releases of mightymandel since the last post, here are the changes from v12 to v13:

v13 mightymandel v13 stable release Reduced dependencies: * glm replaced by simple maths * glib regex replaced by new parsers New parsers: * new support for ppm as output by mightymandel * new support for mdz xmin/xmax/ymax variant * all supported formats render the right size (assuming window aspect ratio is correct) New weight control for -no-de: * can make lines lighter or darker if desired * corresponding feature for DE coming soon Improved series approximation: * for deep zooms it is initialized on CPU to avoid overflow on GPU * warning if image is likely to fail due to insufficient precision Miscellaneous: * zoom assembler updated, now can pipe direct * new script to convert ppm to png preserving mightymandel metadata with location information * new script to make assembling tiled images easy

And from v13 to v14 (current stable):

v14 mightymandel v14 stable release Build system improvements: * ported from C++11 to C99 * new Makefile flags XTRA_COMPILE_FLAGS and EXTRA_LINK_FLAGS which may be useful for compiling on interesting environments without patching the Makefile itself Rendering improvements: * -weight now works with DE * -no-de weight changed to match DE appearance * glitched interiors no longer stay glitched * interiors coloured yellow instead of white User interface improvements: * fixed buggy mouse button handling * inverted weight key control order * -version flag prints version and exits Documentation improvements: * fixed git instructions in documentation * documented PageUp/PageDown zoom key controls * added hardware comparison benchmarks

Source code at mightymandel on gitorious mightmandel on code.mathr.co.uk.

]]>I've been working on mightymandel a lot the past few days, and finally got it in a clean enough state to make another release, and this one is worthy of an announcement. mightymandel is a GPU-based Mandelbrot Set explorer (currently based on OpenGL 4.1). It uses perturbation techniques with series optimisation to allow deep zooms without having to compute each pixel at high precision. It now has automatic glitch correction, and since v11 there is a massive speed boost: v12 is (almost) twice as fast as v11! (And now it's faster than Kalles Fraktaler 2.7.3 running in wine32.)

Embarassingly, most of the speed boost came from removing a bunch of code. Previously, I tried to find better reference points by atom location algorithms, which require a lot of iterations at high precision. Now, I just pick the center of the glitch blob. Much simpler! Images might need more reference points, but the overhead is reduced so much that it doesn't really matter, and most images will complete rendering much quicker.

There's now a large set of example parameters, if you don't know where to start looking when faced with the infinite. I can't claim credit for these, they're collected from a few online sources, links are in the README inside each subdirectory.

mightymandel now has a test suite, a crufty shell script that renders all the parameters in the examples directory, and gives you a directory of images and a long log file. It takes about 35mins to run on my system, and while it's running it's pretty much impossible to do anything else because of the new mightymandel windows opening every few seconds. But it's worth it, it's much easier to test regressions and improvements before making a new release. Current success rate is 57%, you can see the full details in the TESTING file in the repository. Here's the current set of non-failing images:

Getting mightymandel currently requires git (to get the source code) and compiling it manually. Here's the quick guide, more detail is in the README (including required packages for Debian-based distributions):

git clone http://code.mathr.co.uk/mightymandel.git cd mightymandel git checkout $(git tag -l | sort -V | tail -n 1) make -j6 -C src ./src/mightymandel

You can browse the mightymandel repository browse the mightymandel repository if you like, too. Happy fractal exploring!

]]>