Based on a drawing from a year and a half ago:

I showed it to Brent Yorgey in the **#diagrams** IRC channel and he made this:

```haskell
-- Lozenge (Brent Yorgey)
strip :: (Double -> Double) -> Double -> Double -> Int -> Double -> Diagram SVG R2
strip f lo hi n offset
  = [lo, lo + (hi - lo) / (fromIntegral n - 1) .. hi]
  # map (square . f)
  # hcat' with { sep = offset, catMethod = Distrib }
  # fc black

example = vcat' with { sep = 3, catMethod = Distrib } (replicate 7 str)
  # centerXY # pad 1.5
  where str = strip (\x -> cos x + 1) (-pi) pi 23 3
```

Then I showed him how they interleave and he made this:

```haskell
-- Lozenges (Brent Yorgey)
strip :: (Double -> Double) -> Double -> Double -> Int -> Double -> Diagram SVG R2
strip f lo hi n offset
  = [lo, lo + (hi - lo) / (fromIntegral n - 1) .. hi]
  # map (square . f)
  # hcat' with { sep = offset, catMethod = Distrib }

lozenge = vcat' with { sep = 3, catMethod = Distrib } (replicate 7 str)
  # centerXY
  where str = strip (\x -> cos x + 1) (-pi) pi 23 3

example = mconcat
  [ lozenge # fc black
  , lozenge # fc red # translateY (-1.5) # translateX (width lozenge / 2 - 4.5)
  ]
```

These are so much simpler than my original code, which was horrible C numerical stuff to converge a waveform to the right shape, printing numbers that I fed into gnuplot, then editing the image with GIMP to correct the aspect ratio. I don't have that original code any more, but here's a similar version:

```c
#include <math.h>
#include <stdio.h>
#include <stdlib.h>

const double pi = 3.141592653589793;

int cmp(const void *x, const void *y) {
  const double *a = x;
  const double *b = y;
  if (*a < *b) return -1;
  if (*a > *b) return 1;
  return 0;
}

double y(double x) {
  return (cos(pi * x) + 1) / 2;
}

int main(int argc, char **argv) {
  if (argc < 2) { return 1; }
  int n = atoi(argv[1]);
  double a[2];
  double x[2][n+1];
  int w = 0;
  a[0] = 0.001;
  for (int i = 0; i <= n; ++i) {
    x[0][i] = (2.0 * i) / (2.0 * n + 1.0);
  }
  x[1][0] = 0;
  while (1) {
    printf("a\t= %g\n", a[w]);
    for (int i = 0; i <= n; ++i) {
      printf("x[%d]\t= %g\t%g\n", i, x[w][i], y(x[w][i]));
    }
    double s = y(x[w][0]);
    for (int i = 1; i <= n; ++i) {
      s += 2 * y(x[w][i]);
    }
    a[1 - w] = 1 / s;
    for (int i = 1; i <= n; ++i) {
      x[1 - w][i] = x[w][i-1] + a[w] * ((y(x[w][i-1]) + y(x[w][i]))/2 + y(x[w][n+1-i]));
    }
    w = 1 - w;
    qsort(&x[w][0], n+1, sizeof(x[w][0]), cmp);
  }
  return 0;
}
```

Compile and run, keeping the converged values:

```
gcc -std=c99 -lm lozenge.c
./a.out 19 | head -n 2100 | tail -n 20 > lozenge.dat ; gnuplot
```

Then in gnuplot:

```gnuplot
unset key
unset xtics
unset ytics
unset border
set style fill solid
set terminal png size 8192,8192
set output "lozenge-raw.png"
plot [-0.2:1.2] [-0.7:0.7] "lozenge.dat" using 3:(0):\
($3 - $4 * 0.0512405/3):($3 + $4 * 0.0512405/3):\
((-1+$4) * 0.0512405/3):((1-$4) * 0.0512405/3) with boxxyerrorbars,\
"lozenge.dat" using (1-$3):(0):\
(1-$3 - $4 * 0.0512405/3):(1-$3 + $4 * 0.0512405/3):\
((-1+$4) * 0.0512405/3):((1-$4) * 0.0512405/3) with boxxyerrorbars
```

Finally, crop the image in GIMP.

A further experiment led to this gradient with 5 colours out of phase:

Do check out the links above to the Haskell diagrams pastebin, it's quite awesome.

]]>I drew part of a \(\{6,3\}\) tiling of the plane with stylized fish; the full tiling would look more like:

The shape of the fish is constructed with compass and straightedge. First create a hexagon with origin \(O\), vertices \(V_i\), and edge midpoints \(E_i\). Define \(\operatorname{line}(p, q)\) as the line passing through \(p\) and \(q\), and \(\operatorname{circle}(p, q)\) as the circle centered at \(p\) passing through \(q\). The remaining points of the fish can then be found:

\[\begin{align*} c &= \operatorname{line}(E_3, E_4) \cap \operatorname{line}(V_4, V_0) \\ i &= \operatorname{line}(E_4, E_5) \cap \operatorname{line}(V_5, V_3) \\ e &= \operatorname{circle}(V_4, c) \cap \operatorname{line}(E_4, E_5) \text{ furthest from } E_4 \\ g &= \operatorname{circle}(V_5, i) \cap \operatorname{line}(E_4, E_3) \text{ furthest from } E_4 \end{align*}\]

It turns out that this construction is also valid in hyperbolic space, because nothing depends on the existence of unique parallels. Here's a diagram showing four hexagons about each vertex in the Poincaré disk model of hyperbolic geometry:

And another showing three octagons about each vertex:

In hyperbolic space areas and angles are connected. The key step in making hyperbolic tilings is finding the side lengths of the fundamental triangle, between the origin \(O\), vertex \(V\), and midpoint of neighbouring edge \(E\). All the angles are known given the Schläfli symbol \(\{p,q\}\): they are \((\pi/p, \pi/q, \pi/2)\). The side lengths can be calculated using the hyperbolic law of cosines:

\[\begin{align*} \cosh(|OV|) &= \frac{\cos(\pi/2) + \cos(\pi/p) \cos(\pi/q)}{\sin(\pi/p) \sin(\pi/q)} \\ \cosh(|OE|) &= \frac{\cos(\pi/q) + \cos(\pi/p) \cos(\pi/2)}{\sin(\pi/p) \sin(\pi/2)} \\ \cosh(|EV|) &= \frac{\cos(\pi/p) + \cos(\pi/q) \cos(\pi/2)}{\sin(\pi/q) \sin(\pi/2)} \end{align*}\]
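To make the formulas concrete, here's a quick numerical check (a Python sketch of my own, not from the original code): for the Euclidean tiling \(\{4,4\}\) the triangle degenerates (all the cosh values are 1, so the side lengths are 0), while for a hyperbolic tiling like \(\{6,4\}\) the sides have positive length, and the hyperbolic Pythagorean identity \(\cosh(|OV|) = \cosh(|OE|)\cosh(|EV|)\) holds at the right angle.

```python
import math

def cosh_sides(p, q):
    """Hyperbolic law of cosines for the (pi/p, pi/q, pi/2) triangle."""
    A, B, C = math.pi / p, math.pi / q, math.pi / 2
    cosh_OV = (math.cos(C) + math.cos(A) * math.cos(B)) / (math.sin(A) * math.sin(B))
    cosh_OE = (math.cos(B) + math.cos(A) * math.cos(C)) / (math.sin(A) * math.sin(C))
    cosh_EV = (math.cos(A) + math.cos(B) * math.cos(C)) / (math.sin(B) * math.sin(C))
    return cosh_OV, cosh_OE, cosh_EV

# {4,4} is Euclidean: cosh = 1 means zero length, the triangle collapses
ov, oe, ev = cosh_sides(4, 4)
assert abs(ov - 1) < 1e-12 and abs(oe - 1) < 1e-12 and abs(ev - 1) < 1e-12

# {6,4} is hyperbolic: cosh(|OV|) = sqrt(3), and Pythagoras holds at E
ov, oe, ev = cosh_sides(6, 4)
assert abs(ov - math.sqrt(3)) < 1e-12
assert abs(ov - oe * ev) < 1e-12
```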

Here are some fish tiling variations in hyperbolic space, with Poincaré half-plane model representations too:

I wrote the implementation in Haskell. My code consists of a library for compass-and-straightedge construction with instances for Euclidean (flat) and hyperbolic (negatively curved) space with spherical (positively curved) space a possibility in the future, along with embeddings into Euclidean space for visualisation using the Diagrams library. The cool part is that the same code generates all the variations from a few parameters.

]]>In my previous post on Escher's Butterflies I made a symmetrical plane tiling of butterflies with 4 colours. I was curious whether other numbers of colours could result in a symmetrical triangle vertex colouring. (There are pictures below if my rambling gets too long...)

Starting from a triangular lattice (like isometric graph paper), pick two of the three line directions as axes. Then you can label a point P with unique coordinates (p,q), where p is the number of steps along the first axis and q is the number of steps along the second axis. Set the origin at (0,0), pick your P at (p, q), and form an equilateral triangle with a point Q at (-q, p+q) (for the acute axes I chose; obtuse axes might have different coordinates for Q, haven't checked). Then you can extend the triangle OPQ into its own, larger, triangular lattice.

It turns out that the triangle OPQ has area p²+q²+pq times the area of the smallest triangles forming the lattice. Calling this number n, we could use n colours in the n smallest triangles inside the OPQ triangle, but that wouldn't help colouring the vertices of the triangular lattice, which is the whole point of this exercise.
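One way to sanity-check the area claim (my own check in Python, not part of the original code): with lattice axes at 60 degrees, the squared length of a vector (p, q) is p²+q²+pq, and Q = (-q, p+q) has the same squared length as P with a 60-degree angle between them, so OPQ is equilateral with area n = p²+q²+pq small triangles.

```python
def sqnorm(p, q):
    # squared length of p*e1 + q*e2 where e1, e2 are unit vectors with e1.e2 = 1/2
    return p * p + q * q + p * q

p, q = 3, 1
P = (p, q)
Q = (-q, p + q)
n = sqnorm(p, q)                  # area of OPQ in units of the smallest triangle
assert sqnorm(*Q) == n            # |OQ| = |OP|: the big triangle is equilateral
# dot product in the same skewed metric: the angle POQ is 60 degrees
dot = P[0]*Q[0] + P[1]*Q[1] + (P[0]*Q[1] + P[1]*Q[0]) / 2
assert abs(dot / n - 0.5) < 1e-12  # cos 60 = 1/2
```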

A simple way to colour the vertices might be to start at one vertex of the OPQ lattice, pick one of the axes, then step along in that direction assigning the next colour to each base lattice vertex - but eventually you will hit another OPQ lattice vertex. By symmetry, we want each OPQ lattice vertex to be assigned the same colour - and any translation or (k pi / 3) rotation of the OPQ lattice across the base lattice should give an O'P'Q' lattice with all its vertices the same colour.

So far we have a row along one axis of colour-assigned base lattice vertices, from one OPQ lattice vertex as far as possible, stopping one base step before the next OPQ lattice vertex on that axis. Now we can copy/paste those assignments to all of the other OPQ lattice vertices. So far so good, but we might not have assigned colours to all the base lattice vertices yet, because the OPQ lattice might not hit all rows of the base lattice - for example with (p,q)=(2,2) the OPQ lattice misses every other row.

Here's where it gets a bit fuzzy, because I was hacking on code late at night half-asleep, randomly changing things in the hope that it would work, and eventually it did - so I don't know why it works, or even if it works in absolutely every single case, but it seems fine so far... The number of rows you need to fill is the greatest common divisor of p and q, and each row has exactly the same length by translational symmetry. So you get a parallelogram of height h = p`gcd`q with some width. Somehow it all works out so that the width is n divided by that height: recall n = p²+q²+pq and h = p`gcd`q, so h divides both p and q, hence h divides n exactly, and w = n / h gives an integer.
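The arithmetic in that last step can be checked directly (a Python check of my own; the function name is mine):

```python
from math import gcd

def parallelogram(p, q):
    n = p * p + q * q + p * q   # number of colours
    h = gcd(p, q)               # number of rows to fill
    assert n % h == 0           # h | p and h | q, so h | p^2 + q^2 + p*q
    return h, n // h            # (height, width)

assert parallelogram(2, 2) == (2, 6)   # n = 12: every other row is missed
assert parallelogram(3, 1) == (1, 13)  # coprime: a single row of 13 covers it
```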

Anyway, here are some diagrams that might explain it more clearly, followed by butterfly tessellations in various numbers of colours:

And you can get the code (for both the diagrams and tessellations):

git clone git://gitorious.org/maximus/butterflies.git

Or browse butterflies on Gitorious.

]]>A decade of IRC logs for the **#haskell** IRC channel on freenode, filtered down to the 1000 most frequently occurring words of 4 letters or more, arranged in an infinite fractal zoom - each word is made up of the words most likely to follow it.

This is from six months ago but I didn't post about it yet.

Video available on the Internet Archive:

Made with an early version of the next generation of *graphgrow*,
whose source code is available here:

git clone git://gitorious.org/maximus/graphgrow.git

Or browse graphgrow on Gitorious.

]]>Welcome!

]]>Zooming rotating fractal gearbox demo, implemented with OpenGL. Took a lot of tricky trial and error to get the maths for the rotations correct. One shader generates the gear shape, and another hue-rotates the smaller copies - using texture feedback to generate the full fractal.

Features planned in future versions include randomized zooming into gears other than the centre, but that might take some time.

Code available on Hackage:

cabal install gearbox

Or download the source: gearbox-1.tar.gz

]]>Iterated function systems rendered with a method similar to The Sky Cracked Open using functions similar to B15 (2008) and PostScript experiments (2007).

Code available on Hackage:

cabal install snowglobe

Or download the source: snowglobe-1.tar.gz

]]>The 43MB video linked above is pretty much generated by these 70 lines of GLSL fragment shader:

```glsl
#extension GL_EXT_gpu_shader4 : enable
uniform vec4 u; // unit
uniform vec4 v; // unit
uniform vec4 w;
uniform float bigness;
uniform vec3 lightPos;
void main() {
  vec2 z = bigness * gl_TexCoord[0].xy;
  vec4 p = u * z.x + v * z.y + w;
  // nearest sphere
  vec4 o1 = round(p);
  vec4 o2 = o1 + vec4(0.5);
  if (p.x < o1.x) o2.x -= 1.0;
  if (p.y < o1.y) o2.y -= 1.0;
  if (p.z < o1.z) o2.z -= 1.0;
  if (p.w < o1.w) o2.w -= 1.0;
  vec4 d1 = o1 - p;
  vec4 d2 = o2 - p;
  float l1 = dot(d1,d1);
  float l2 = dot(d2,d2);
  vec4 o = o1;
  bool odd = false;
  if (l2 < l1) { o = o2; odd = true; }
  // intersection circle
  float ou = dot(u, o - w);
  float ov = dot(v, o - w);
  vec4 c = u * ou + v * ov + w;
  vec4 oc = o - c;
  float d = dot(oc, oc);
  float r2 = 0.25 - d;
  vec2 dz = z - vec2(ou, ov);
  float dr2 = dot(dz, dz);
  vec4 colour = vec4(0.0, 0.0, 0.0, 0.0);
  float alpha = 0.0;
  float brightness = 0.0;
  float specular = 0.0;
  if (dr2 < r2) {
    // inside circle
    float k = o.x + o.y + o.z + o.w;
    bool even = abs(2.0 * floor(k / 2.0) - k) < 0.001;
    if (odd) {
      if (even) { colour = vec4(1.0, 0.7, 0.2, 1.0); }
      else      { colour = vec4(0.7, 1.0, 0.2, 1.0); }
    } else {
      if (even) { colour = vec4(1.0, 0.2, 0.7, 1.0); }
      else      { colour = vec4(0.7, 0.2, 1.0, 1.0); }
    }
    // sphere shading
    const float shine = 16.0;
    float height = sqrt(r2 - dr2);
    vec3 pos = vec3(z, height);
    vec3 light = normalize(lightPos - pos);
    vec3 normal = normalize(vec3(dz, height));
    float dlm = max(dot(light, normal), 0.0);
    vec3 reflected = 2.0 * dlm * normal - light;
    specular = pow(max(reflected.z, 0.0), shine);
    brightness = 0.0625 + dlm;
  } else {
    discard;
  }
  gl_FragColor = vec4(mix(colour.rgb * brightness, vec3(1.0), specular), 1.0);
}
```

In 3D Euclidean (flat) space, you could imagine filling it with cubes of side length 1, much like graph paper in 2D. Then you could fill each cube with a sphere of radius 1/2. In 3D the gap between those spheres (at the corners of each cube) has diameter sqrt(3) - 1 (about 0.732). But, in 4D space a similar hypercube lattice each filled with spheres leaves a gap with diameter sqrt(4) - 1 (exactly 1). This means you can fit another radius 1/2 sphere at each corner gap.
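The gap sizes quoted above follow from the half-diagonal of the unit n-cube: each corner is sqrt(n)/2 from the cube centre, so subtracting the sphere radius 1/2 leaves room for a sphere of diameter sqrt(n) - 1 at the corner. A quick check (my own Python, not from the post):

```python
import math

def corner_gap_diameter(n):
    # corner-to-centre distance sqrt(n)/2, minus the sphere radius 1/2, doubled
    return 2 * (math.sqrt(n) / 2 - 0.5)

assert abs(corner_gap_diameter(3) - (math.sqrt(3) - 1)) < 1e-12  # about 0.732
assert corner_gap_diameter(4) == 1.0  # exactly room for another radius-1/2 sphere
```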

The spheres with integer coordinates (the ones at the corners) can be grouped together, and the ones with integer+1/2 coordinates (the ones in the middle of the cubes) can form another group. Moreover, you can use a parity argument to further subgroup each half: determining whether the sum of the coordinates is odd or even gives four groups of spheres in total, arranged in a hard to imagine 4D chessboard-like pattern.
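As a sketch of that grouping (Python, function name my own): a sphere centre is classified by whether it lies on the integer or half-integer lattice, and by the parity of the sum of its coordinates.

```python
def sphere_group(center):
    # center: 4-tuple, either all integers or all integers + 1/2
    half = center[0] != round(center[0])   # on the half-integer lattice?
    parity = round(sum(center)) % 2        # sum of coordinates, even or odd
    return (half, parity)

assert sphere_group((0, 0, 0, 0)) == (False, 0)
assert sphere_group((1, 0, 0, 0)) == (False, 1)
assert sphere_group((0.5, 0.5, 0.5, 0.5)) == (True, 0)   # halves sum to 2
assert sphere_group((1.5, 0.5, 0.5, 0.5)) == (True, 1)   # halves sum to 3
```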

In 3D space you might imagine taking a 2D slice through the grid of spheres. Depending where and at what angle you slice, you might get a regular square grid of circles and gaps or a more unpredictable pattern of differently sized circles and gaps where you chop through different parts of the spheres. You can do the same in the 4D lattice, which can similarly give a regular square grid of circles or irregular-looking patterns.

So, enough imagining: how to actually visualize it. The first step is to define a 2D slice through 4D space:

P(u,v) = u*U + v*V + W

Here lowercase letters are scalar real numbers, and uppercase letters are vectors. For convenience, assume U and V have length 1 and point in different enough directions. W is the origin of the plane, and U and V can be thought of as the local axes within the 2D plane.

The next step, given a point (u,v) in this plane (which will eventually be drawn as a pixel on the screen), is to find the center of the nearest sphere in the lattice. The nearest integer sphere can be found by rounding each coordinate of P(u,v), but the nearest half-integer sphere is a bit more tricky - the method in the shader source code above checks in each dimension whether the rounded coordinate is bigger or smaller than the unrounded coordinate, and picks the neighbouring half-integer coordinate so that the unrounded coordinate lies between the two. Then compare the actual distances from P(u,v) to these two sphere centers, and pick the closer.
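Here is the same nearest-centre search transcribed into Python (my transcription; the shader's round() is mimicked with floor(x + 0.5)):

```python
import math

def nearest_sphere_center(p):
    # nearest integer-lattice centre: round each coordinate
    o1 = [math.floor(c + 0.5) for c in p]
    # nearest half-integer centre: step to the half-integer on the side of the
    # rounded coordinate that the unrounded coordinate lies on
    o2 = [a + 0.5 if c >= a else a - 0.5 for c, a in zip(p, o1)]
    d1 = sum((a - c) ** 2 for c, a in zip(p, o1))
    d2 = sum((a - c) ** 2 for c, a in zip(p, o2))
    return o2 if d2 < d1 else o1

# deep inside a cube, the half-integer (cube-centre) sphere wins
assert nearest_sphere_center((0.3, 0.3, 0.3, 0.3)) == [0.5, 0.5, 0.5, 0.5]
# near a corner, the integer-lattice sphere wins
assert nearest_sphere_center((0.1, 0.0, 0.0, 0.0)) == [0, 0, 0, 0]
```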

Now we have the closest sphere centered at O, but we want to find the circle that is formed if our plane is slicing through that sphere. I thought the algebra would be hairier than it was, but luckily symmetry comes to the rescue. The center of the circle P(ou, ov) is the nearest point on the plane to the center of the sphere, which can be found by minimizing the distance function (knowing that the differential of a function is 0 at extrema). Leaving out the workings (and assuming U and V are unit length):

ou = dot(U, O - W) ; ov = dot(V, O - W)

But there might be no circle here at all, if (u,v) happens to fall in a gap. Luckily we know the sphere has radius 1/2, so we can check that ||P(ou, ov) - O|| < 1/2. The next step is to find the radius of the circle, which is quite straightforward by Pythagoras' Theorem - the line from the sphere center to the circle center is perpendicular to the plane of the circle, and the hypotenuse has length 1/2, and we already computed the length of the adjacent side.
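Putting those two steps together in Python (a sketch of my own, under the same assumption that U and V are orthonormal): the circle centre in plane coordinates is (ou, ov), and its radius follows by Pythagoras from the sphere radius 1/2 and the plane-to-sphere-centre distance.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def slice_circle(U, V, W, O):
    """Circle where the plane P(u,v) = u*U + v*V + W cuts the radius-1/2 sphere
    centered at O. Returns ((ou, ov), r), or None if the plane misses the sphere."""
    OW = [o - w for o, w in zip(O, W)]
    ou, ov = dot(U, OW), dot(V, OW)                   # nearest point on the plane
    C = [u * ou + v * ov + w for u, v, w in zip(U, V, W)]
    d2 = sum((o - c) ** 2 for o, c in zip(O, C))      # squared plane-to-centre distance
    r2 = 0.25 - d2                                    # Pythagoras, hypotenuse 1/2
    if r2 <= 0:
        return None
    return (ou, ov), math.sqrt(r2)

U, V = (1, 0, 0, 0), (0, 1, 0, 0)
W, O = (0, 0, 0.3, 0), (0, 0, 0, 0)
(ou, ov), r = slice_circle(U, V, W, O)
assert (ou, ov) == (0.0, 0.0)
assert abs(r - 0.4) < 1e-12   # sqrt(0.25 - 0.3^2)
```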

Now (unless we bailed out with no circle) we have our starting point (u,v), the center of a circle that contains it (ou, ov), and the radius of the circle r. But we want to make it look like a sphere! So we need to find the height of the sphere's surface above the plane. Then for shiny lighting it's easy to calculate the surface normal (as a sphere is uniformly round, the normal points away from its center). With the surface position and normal, and a light moving around in 3D above our plane, standard Phong lighting can make it look like proper disco balls.

]]>A short fractal video loop rendered with continuous escape time for an iterated function system.

**The Generator** for this iterated function system is
three lines in a zigzag:

Given the fractal dimension D (between 1 and 2), the length L (between 0 and 1) of the middle line segment can be calculated by finding a root (either by standard numerical methods, or by binary search as the function is strictly monotonic) of:

2 * (sqrt ((1 - L^2) / 4))^D + L^D - 1 = 0
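Since the left-hand side is strictly monotonic in L on the bracketed interval, binary search converges quickly; here's a Python sketch of my own (not the original implementation):

```python
import math

def middle_length(D, eps=1e-12):
    """Solve 2 * (sqrt((1 - L^2) / 4))^D + L^D - 1 = 0 for L by bisection."""
    f = lambda L: 2 * math.sqrt((1 - L * L) / 4) ** D + L ** D - 1
    lo, hi = 0.01, 0.99          # f(lo) < 0 < f(hi) for 1 < D < 2
    while hi - lo > eps:
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

L = middle_length(1.5)
assert 0 < L < 1
assert abs(2 * math.sqrt((1 - L * L) / 4) ** 1.5 + L ** 1.5 - 1) < 1e-9
```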

**The Affine Transformations** (each T_i
expressed as a 3x3 matrix using homogeneous coordinates) that take the
whole to the parts are calculated by:

T_i = post * M(a_i, l_i, v_i) * pre

a_1 = pi/2 - acos L

a_2 = -acos L

a_3 = a_1

l_1 = sin (acos L) / 2

l_2 = L

l_3 = l_1

v_1 = (0, 0)

v_2 = (l_1 * cos a_1, l_1 * sin a_1)

v_3 = (1, 0) - v_2

M(a, l, (x, y)) = (c, s, x; -s, c, y; 0, 0, 1) where c = l * cos a, s = l * sin a

post = M(0, 1, (-0.5, 0))

pre = M(0, 1, (0.5, 0))
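The matrices can be assembled mechanically; here's a Python transcription of M and the T_i (my own, for illustration), checking that each T_i contracts by exactly l_i - pre and post are pure translations, so they don't change the scale:

```python
import math

def M(a, l, v):
    # scaled rotation plus translation, as a 3x3 homogeneous matrix
    c, s = l * math.cos(a), l * math.sin(a)
    x, y = v
    return [[c, s, x], [-s, c, y], [0.0, 0.0, 1.0]]

def mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

L = 0.5  # any middle-segment length in (0, 1) defines a zigzag
a1 = math.pi / 2 - math.acos(L); a2 = -math.acos(L); a3 = a1
l1 = math.sin(math.acos(L)) / 2; l2 = L; l3 = l1
v1 = (0.0, 0.0)
v2 = (l1 * math.cos(a1), l1 * math.sin(a1))
v3 = (1.0 - v2[0], 0.0 - v2[1])
post = M(0, 1, (-0.5, 0.0))
pre = M(0, 1, (0.5, 0.0))
Ts = [mul(post, mul(M(a, l, v), pre))
      for a, l, v in [(a1, l1, v1), (a2, l2, v2), (a3, l3, v3)]]

# the linear part of each T_i is a scaled rotation: its determinant is l_i^2
for T, l in zip(Ts, [l1, l2, l3]):
    det = T[0][0] * T[1][1] - T[0][1] * T[1][0]
    assert abs(math.sqrt(det) - l) < 1e-12
```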

**Continuous Escape Time** for an iterated function
system is defined (for points not in its attractor) by:

E(x) = 0, if ||x|| >= R

E(x) = max_i { (log R - log ||x||) / (log ||F_i^{-1}(x)|| - log ||x||) }, if ||x|| < R and ||F_i^{-1}(x)|| >= R for all i

E(x) = 1 + max_i { E(F_i^{-1}(x)) }, otherwise

which can be described as the length of the *longest* path that
escapes beyond a circle of radius R. Computing this on a per-pixel basis
by exhaustive search through all paths takes O(m^n) time, where m is the
number of functions in the system and n is the eventual escape count.

**A Faster Method Of Computation** taking O(m*n) time,
can be achieved by rasterizing a square big enough to contain a circle
of radius R centered on the origin, initially filled with the first two
cases of the escape time equation E(x), and incrementally refining it
by accumulating the effects of each transformation in parallel. This
approach lends itself well to GPU-based computation, though the images
are less accurate (noticeably blurrier) than per-pixel recursion.

To determine the number of incremental refinement passes to perform, an estimate can be derived from the largest contraction factor and a desired accuracy (eps = 0.001 for example):

l_max = max_i { l_i }

passes = ceiling (log_l_max eps)
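For example (my own arithmetic check in Python), with l_max = 0.5 and eps = 0.001:

```python
import math

def refinement_passes(l_max, eps):
    # after k passes the remaining error scales like l_max^k, so need l_max^k <= eps
    return math.ceil(math.log(eps) / math.log(l_max))

assert refinement_passes(0.5, 0.001) == 10   # 0.5^10 ~ 0.00098 <= 0.001
```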

**The Initialisation Pass** fills a raster, where er = R
and rho = l_max, using -1 as a sentinel for unknown values:

```glsl
uniform float er;
uniform float rho;
void main() {
  vec2 p = er * (gl_TexCoord[0].xy * 2.0 - vec2(1.0));
  float l = length(p);
  float n;
  if (l >= er) {
    n = 0.0;
  } else if (er > l && l >= rho * er) {
    n = (log(er) - log(l)) / -log(rho);
  } else {
    n = -1.0;
  }
  gl_FragData[0] = vec4(n);
}
```

**The Refinement Passes** accumulate transformations
(each ts[i] is the matrix inverse of T_i):

```glsl
uniform sampler2D src;
uniform float er;
uniform mat3 ts[3];
void main() {
  vec2 p0 = er * (gl_TexCoord[0].xy * 2.0 - vec2(1.0));
  float m = -1.0;
  for (int i = 0; i < 3; ++i) {
    vec3 p = ts[i] * vec3(p0, 1.0);
    vec2 q = p.xy / p.z;
    float l = length(q);
    if (l < er) {
      m = max(m, texture2D(src, (q / er + vec2(1.0)) / 2.0).x);
    }
  }
  if (m >= 0.0) { m += 1.0; }
  m = max(m, texture2D(src, (p0 / er + vec2(1.0)) / 2.0).x);
  gl_FragData[0] = vec4(m);
}
```

The usual GPGPU-style ping-pong technique between textures bound to frame buffer objects is used here.

**The Colouring Pass** assigns colours to the escape
times, where speed = 2 pi / passes:

```glsl
uniform sampler2D src;
uniform float speed;
void main() {
  vec2 p = gl_TexCoord[0].xy;
  float n = texture2D(src, p).x * speed;
  vec3 yuv;
  if (n < 0.0) {
    yuv = vec3(0.0);
  } else {
    yuv = vec3(clamp(log(1.0 + n / 4.0), 0.0, 1.0), 0.125 * sin(n), -0.125 * cos(n));
  }
  vec3 rgb = yuv * mat3(1.0, 0.0, 1.4, 1.0, -0.395, -0.581, 1.0, 2.03, 0.0);
  gl_FragData[0] = vec4(rgb, 1.0);
}
```

**Cropping** to the interesting region can be performed
by manipulating texture coordinates in the colouring pass:

```haskell
drawCropped er = renderPrimitive Quads $ do
    t (0.5-tx) (0.5+ty) >> v 0 1
    t (0.5-tx) (0.5-ty) >> v 0 0
    t (0.5+tx) (0.5-ty) >> v 1 0
    t (0.5+tx) (0.5+ty) >> v 1 1
  where
    aspect = 16/9 -- window aspect ratio
    tx = 1 / (2 * er)
    ty = 1 / (2 * er) / aspect
    t, v :: GLdouble -> GLdouble -> IO ()
    t x y = texCoord (TexCoord2 x y)
    v x y = vertex (Vertex2 x y)
```

**Benchmarks**: it took around 12 minutes to render
375 frames of 1280x720 video with D varying from 1 to 1.75 and R = 8
with 8192px square 1-channel float textures on an NVIDIA GTX 550Ti.

**References**:

- continuous escape time: "Rendering Methods for Iterated Function Systems", D Hepting, P Prusinkiewicz

The diagram above shows two views: the top half is a view of the complex plane with some circles indicating viewports at different sizes (zoom levels) and locations. The problem I wanted to solve was:

how might one derive the shortest animation path from one viewport to another.

My intuition was that the shortest path would involve zooming out, moving sideways, and zooming in again, but sharp corners between the zooming sections and the translating sections would be ugly, and almost certainly add to the length of the path. I remembered the diagrams on the wikipedia page on the Poincaré half-plane model of hyperbolic geometry, and thought that perhaps it would apply. In that model, the shortest path (aka a straight line) between two points in hyperbolic space maps to a circular arc through Euclidean space.

Mapping the radius of the view to the imaginary coordinate (height) in the half-plane, and the distance along the line between the centres of the viewports to the real coordinate, then the Poincaré half-plane geodesics (shortest paths) indeed zoom out, translate, and zoom in again in the smoothest possible way. This is shown in the bottom half of the diagram on this page.

(At this point I might remark that I tried implementing this some time ago in Mandulia, but I didn't understand it enough, I got as far as circular arcs but the scaling was all wrong and I had to apply many ad-hoc hacks to get it to look ok.)

Some experimentation followed, implemented like this in Haskell:

```haskell
-- representation of viewports
data Poincare a = Poincare !(Complex a) !a deriving Show

-- hyperbolic distance between viewports
distance (Poincare x1 y1) (Poincare x2 y2) =
  acosh (1 + (magnitudeSquared dx + dy * dy) / (2 * y1 * y2))
  where dx = x2 - x1 ; dy = y2 - y1

-- hyperbolic geodesic
-- the parameter t in [0..1] interpolates between the endpoints
geodesic p1@(Poincare x1 y1) p2@(Poincare x2 y2)
  | x1 == x2 = \t -> Poincare x1 (y1 * (y2 / y1) ** t)
  | otherwise = \t ->
      let x:+y = go (exp (t * dt))
      in  Poincare (x1 + (x :+ 0) * dir) y
  where
    go t = (b :+ (a * t)) / (d :+ (c * t)) -- interpolate geodesic
    a = d * y1
    b = -c * y1
    [c, d] = toList . nullVector $ (2><2)
      [ eT * mdx, y2 - y1 * eT
      , y1 - eT * y2, mdx ]
    eT = exp dt
    dt = distance p1 p2  -- hyperbolic distance between viewports
    dx = x2 - x1         -- vector between viewport centers
    mdx = magnitude dx   -- distance between viewport centers
    dir = cis (phase dx) -- unit vector between viewport centers
```
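A quick numeric check of the distance formula (my own translation to Python, with real-valued centers for simplicity): zooming from radius 1 to radius e above the same centre is hyperbolic distance exactly 1, and to radius e² is exactly 2, confirming that zoom depth is logarithmic.

```python
import math

def distance(x1, y1, x2, y2):
    # hyperbolic distance in the Poincare half-plane; (x, y) = (centre, radius)
    dx2 = abs(x2 - x1) ** 2
    dy = y2 - y1
    return math.acosh(1 + (dx2 + dy * dy) / (2 * y1 * y2))

assert abs(distance(0, 1, 0, math.e) - 1) < 1e-12
# zooming from radius 1 to radius e^2 is twice as far as to radius e
assert abs(distance(0, 1, 0, math.e ** 2) - 2) < 1e-12
```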

The complicated bit is the computation of the geodesic parameters, which I derived like this:

```
  { start with a pair of equations for the end points }
    (a i + b) / (c i + d) = 0 + i y1
    (a i e^T + b) / (c i e^T + d) = |x2 - x1| + i y2
=> { multiply out }
    (a i + b) = (c i + d) * (0 + i y1)
    (a i e^T + b) = (c i e^T + d) * (|x2 - x1| + i y2)
=> { equate real and imaginary parts of each equation }
    a = d * y1
    b = -c * y1
    a e^T = c e^T |x2 - x1| + d y2
    b = d |x2 - x1| - c e^T y2
=> { substitute a and b to get two equations in two variables }
    d y1 e^T = c e^T |x2 - x1| + d y2
    -c y1 = d |x2 - x1| - c e^T y2
=> { rearrange }
    c e^T |x2 - x1| + d (y2 - y1 e^T) = 0
    c (-(e^T y2 - y1)) + d |x2 - x1| = 0
=> { express in matrix form }
    M (c;d) = 0
```

Now, my first attempt failed miserably, because the standard methods to solve matrix equations see M (c;d) = 0 and end up finding the solution c = 0, d = 0, which makes a = 0 and b = 0. This is useless: you end up with 0/0, which is undefined nonsense. Recalling linear algebra, I looked up the term null space, and it indeed turned out to be what was needed here, namely to find a vector v such that M v = 0 and v /= 0. Handily there is an implementation in the hmatrix package, and as I only needed one such vector, the nullVector function was a perfect fit.
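For a 2x2 system you don't even need a general routine: if M is singular and its first row is nonzero, (c, d) = (m01, -m00) is already a null vector. A Python sketch of my own (the Haskell code uses hmatrix's nullVector instead):

```python
def null_vector_2x2(M):
    # M is singular (det = 0); return a nonzero v with M v = 0
    (a, b), (c, d) = M
    if (a, b) != (0, 0):
        return (b, -a)   # row 1: a*b + b*(-a) = 0; row 2 follows since det = 0
    return (d, -c)

M = [[2, 4], [1, 2]]   # det = 0
v = null_vector_2x2(M)
assert v != (0, 0)
assert M[0][0] * v[0] + M[0][1] * v[1] == 0
assert M[1][0] * v[0] + M[1][1] * v[1] == 0
```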

I made a demo of this technique in action (18MB), using ruff to find locations and gruff to render the animation.

References:

- metric: Poincaré half-plane model metric (wiki)
- geodesic: Poincaré half-plane model geodesic (wiki)
- null space: Kernel (matrix) null space (wiki)

**ruff** (relatively useful fractal functions) is a
library for exploring fractals (currently concentrating on the
Mandelbrot Set).

**gruff** (GUI for ruff) is a program for exploring the
Mandelbrot Set using the facilities of ruff. Version 0.3 introduces a
gruff library, for programmatic generation of image specifications that
can then be fed into gruff for batch rendering
of diagrams with labels and rays,
or non-standard animations,
or series of images.

Of course traditional zooming animations are possible too.

**gruff-examples** is a package with some example programs
that use the gruff library, including a converter from old-format .gruff
files to the current format understood by gruff-0.3.

To get it all:

cabal update && cabal install gruff-examples

(add -fmpfr if you have a GHC with integer-simple)

ruff on hackage (including API documentation)

gruff on hackage (including usage information)

ruff-0.3.tar.gz (source code tarball)

gruff-0.3.1.tar.gz (source code tarball)

gruff-examples-0.3.1.tar.gz (source code tarball)

ruff, gruff, and gruff-examples on gitorious (development repositories)

]]>↑ hp2pretty-0.4 was slow | hp2pretty-0.5 is fast ↓

While trying to figure out the correct way to handle character
encodings in GHC heap profile output, so I could fix **hp2pretty**
to stop emitting invalidly encoded UTF-8 XML SVG (which breaks rsvg
rendering), I stumbled upon hp2any,
which has somewhat related aims.

So to avoid duplication of effort, I hacked on hp2pretty to make it
use **hp2any-core** for as much as possible. Unfortunately,
while initial benchmarks seemed promising (if anything my original code
was slightly slower), when I tried it on a huge test input file it didn't
fare so well:

```
$ ls -1sh huge.hp
36M huge.hp
$ time hp2pretty-0.4 huge.hp
real    1m49.979s
user    1m49.440s
sys     0m0.500s
-- observed memory usage in top: 100M
$ time hp2pretty-0.4-with-hp2any-core huge.hp
real    2m52.501s
user    2m51.470s
sys     0m0.930s
-- observed memory usage in top: 800M
```

Even though it took 50% longer and used 8x as much memory, it didn't magically fix my character encoding issues. But I thought it was on the right track to avoid duplication of effort, as hp2any has a nice API.

So I profiled heap usage, and got a huge spike with uninformative labels:

Adding cost centers helped to see where the problem might be:

Using pattern matching to deconstruct lists once to reuse the pieces fared better than multiple uses of head and tail, but adding strictness to the intermediate token data type in the parser was the biggest win:

When tested on the huge.hp file, it fared better than my original code:

```
$ time hp2pretty-using-patched-hp2any-core huge.hp
real    1m36.002s
user    1m35.290s
sys     0m0.680s
-- observed memory usage in top: 500M
```

But once you get a taste for acceleration, it's hard to stop... further slownesses were targeted and squashed using time/allocation profiling:

- Faster printing!

Using fshow from the floatshow package with a Double7 newtype wrapper reduced runtime from 240 seconds to 43 seconds. This made the next big target more visible...

```
COST CENTRE  MODULE  %time  %alloc
showF        SVG      83.7    88.3
```

- Streamlined logic!

Inlining some association list creation and key lookup reduced runtime from 43 seconds to 36 seconds. Not much of an improvement, so I added some cost centre annotation pragmas...

```
COST CENTRE  MODULE               %time  %alloc
parseHpLine  Profiling.Heap.Read   54.2    63.0
```

- Faster parsing!

Using double from the attoparsec package with the parseOnly driver reduced runtime from 36 seconds to 18 seconds (with profiling enabled).

```
COST CENTRE   MODULE               %time  %alloc
pEndSample    Profiling.Heap.Read   26.4    30.8
pBeginSample  Profiling.Heap.Read   24.5    30.8
```

- What's left?

With showF as optimized as it can seemingly be, there's no standout target as a low-hanging fruit to be picked for big speed gains.

```
COST CENTRE     MODULE                %time  %alloc
showF           SVG                    39.1    38.9
main            Main                    9.7    11.5
bands           Bands                   8.6     4.3
readProfile     Profiling.Heap.Read     7.4     8.6
buildIntegrals  Profiling.Heap.Stats    7.0     6.3
```

So this round of painting on go-faster stripes is done; it made the heap profile graphs go from looking somewhat like this:

to looking more like this:

Note the timescale at the bottom of the graphs: I neglected to keep the heap profile output files along with the output images, so I couldn't use hp2pretty's uniform scale options.

With profiling disabled, hp2pretty using the repatched hp2any-core is really quite a lot faster than the original hp2pretty-0.4:

```
$ time hp2pretty-using-repatched-hp2any-core huge.hp
real    0m9.919s
user    0m9.520s
sys     0m0.400s
```

Comparing very favourably to:

```
$ time hp2ps huge.hp
real    0m32.869s
user    0m32.760s
sys     0m0.090s
```

I've sent the relevant patches to hp2any-core to its upstream
maintainer, but in the meantime I backported the **floatshow**
and **attoparsec** usage to the master branch of hp2pretty,
and released v0.5:

**hp2pretty** v0.5 2011-10-15 acceleration

Vastly improved runtime performance thanks to 'floatshow' and 'attoparsec' for (respectively) printing and parsing floating point numbers.

Source code statistics: 544 lines, 3275 words, 20061 chars.

Download hp2pretty-0.5.tar.gz, or install from Hackage:

cabal install hp2pretty

or you can get the latest development source from hp2pretty/hp2pretty on Gitorious

For more info on actually generating heap profiles, check out the GHC heap profiling manual.

]]>