A roadmap for distributed Mandelbrot set rendering

Mandelbrot Set

Here are some notes on some thoughts I've had over the past couple of years, many of them sparked by email exchanges with Robert Munafo.

Racing tests

  1. Checking if the point is exterior by testing if iteration escapes to infinity won't terminate if it isn't exterior.
  2. Checking if the point is interior by testing if it is inside a hyperbolic component won't terminate if it isn't interior.
  3. Solution? Race the tests: perform both checks in parallel and pick the first one that terminates, then kill the other (non-terminating) check.
  4. Problem: the race will never end when the point is exactly on the boundary.
  5. Luckily most exact boundary points are not representable exactly in binary floating point, and most of the ones that are probably have sufficient noise from rounding errors to tip the balance to termination as interior or exterior.
  6. So far the only "pathological" boundary points I've found are on the real axis or at 0±i; inserting special case hacks for these are not too problematic.

Distance estimates

  1. Exterior distance estimates are relatively easy.
  2. Interior distance estimates are doable if you know the period.
  3. A method for computing interior distance estimate for a point c:
    period = 1
    loop {
      compute z0 such that f^p(z0,c) = z0 using Newton's method;
      if (Newton's method failed, for example, z escaped) { continue; }
      if (|d/dz f^p(z0, c)| < 1) {
        compute interior distance estimate using p, c, z0;
        if (interior distance estimate > 0) { return (p, distance); }
      }
      period++;
    }
  4. This method requires neither finding nuclei and tracing boundaries nor awkward numerical period detection and iteration (to find p and z0).
  5. Classify pixels as boring if (interior or exterior) distance estimate > 4 * pixel spacing.

Recursive subdivision

  1. Recursively subdividing an image is a win.
  2. 1/4 of the pixels need not be recomputed compared to the parent layer.
  3. Spatial coherency can be used to optimize calculations.
  4. Boring subpixels (those with only boring neighbours all of the same period) need not be calculated either: estimate its distance estimate by the geometric mean of its neighbours' distance estimates.
  5. But in any case, with the racing technique new tiles can be calculated without a reference to a parent, for exploring deep zooms without having to render the whole chain back up to the root.

Tiling schemes

  1. Square tiles with power-of-two dimensions are good (OpenGL etc).
  2. Boring subtiles (consisting entirely of boring pixels) can be skipped.
  3. Problem: tile edges - subdividing an MxN tile gives a (2M-1)x(2N-1) tile.
  4. Rejected solution: depend on neighbouring tiles in the parent layer (too complicated).
  5. Adopted solution: use power-of-two-plus-one tiles with one pixel overlap between neighbours.
  6. Edge subpixels will be computed twice but this is a small cost when tiles aren't tiny.
  7. Raw tiles can be cropped by 1 pixel at the bottom and left sides before saving as images, to avoid having to remove overlap when displaying.

Tile representation

  1. Only calculate the upper half-plane of the set, the other half is symmetrical.
  2. The root tile has southwest coordinate -4+0i and a size around 8, with a pixel spacing of 2^-N for some integer N.
  3. The bottom row of pixels on the x-axis is filled with dummy data (period -1, distance estimate +inf).
  4. Successive layers get closer and closer to the x-axis, while the axis itself is never calculated.
  5. Also necessary to ensure 0+i has the same dummy data.

Shared cache

  1. Directory structure:
    lock/
    boring.txt
    todo/*.raw
    done/*.raw
    tile/*.{png,jpg}
    work/[clientID]/boring.txt
    work/[clientID]/todo/*.raw
    work/[clientID]/done/*.raw
    work/[clientID]/tile/*.{png,jpg}
  2. Initially all directories are empty apart from the root tile in todo/M.raw, boring is an empty file, lock is an empty directory. Now each renderer:
  3. Creates its working directories work/[clientID]/{todo,done,tile}.
  4. Attempts to create a file in lock, failing if it already exists, retrying after a timeout.
  5. Moves the oldest (by file modification date) file in todo/*.raw to its own todo/ directory.
  6. Removes the lock file.
  7. Loads its todo/*.raw, subdivides it, saves its non-boring subtiles to its done/*.raw and image versions to its tile/*.{png,jpg}, logs the boring subtiles to its boring.txt.
  8. Acquires the lock.
  9. Moves work/[clientID]/done/*.raw to todo/*.raw; moves work/[clientID]/todo/*.raw to done/*.raw; moves work/[clientID]/tile/*.{png,jpg} to tile/*.{png,jpg}; appends work/[clientID]/boring.txt to boring.txt.
  10. Releases the lock.
  11. Locking should allow multiple renderers to cooperate and not clobber the cache, even across different machines with network file systems.

Web interface

  1. Concept: like openstreetmap.org, providing a scrollable zoomable interface.
  2. Initial implementation: no server-side code, use client-side scripting.
  3. Server index.html contains embedded javascript.
  4. Client-side javascript downloads the boring.txt and parses it into an associative array.
  5. Also parses any ?query=url&parameters=when they exist to set up the viewing parameters (x, y, r).
  6. Generates an HTML table with an img in each cell referencing the required url.
  7. Boring cells (those with any tile parent boring, parents are prefixes of the tile identifier) reference urls based on the boring period.
  8. Interesting cells reference urls based on the tile identifier itself.
  9. Server should return a blank tile image for non-existing tile urls, independent of client-side script.
  10. Navigating within the page (with buttons on the table frame for up/down/left/right/zoom-out, and clicking on the image for zooming in) doesn't reload the page, this is handled in the client-side script. New images may be fetched if required.
  11. A 'permalink' button freezes the current view state into an URL with appropriate ?query=parameters.

Social network

  1. Add server-side code to allow remote rendering clients to 'checkout' todo tiles and upload the results when done.
  2. Each rendering client is linked to a user account, users get magic beans for subdividing tiles.
  3. Validation of correct results might be done by comparing results from different users.
  4. Server keeps track of which unrendered tiles are most demanded, bumping priority in the todo queue.
  5. Maintain a database of user-curated "interesting things to see", with annotations / links to articles and prettier images.
  6. Link with a "feature database" of angled internal addresses etc.

Federated mirroring

  1. Multiple servers running same (or different if the API is fixed) code, allowing experimental features to be developed easily on your own machine.
  2. RSS feeds of new tiles, with daily / weekly batch downloads.
  3. Graceful handling of servers going offline in case someone wants a mirror on a laptop; also servers coming back online.