Late last year I implemented some coupled continuous cellular automata, inspired by Softology's experiments. Now I'm finally getting around to blogging about it. I used OpenGL shaders; here's the part of the fragment shader source that implements the main algorithm:

    void main() {
      vec4 s1 = texture(state, coord, 1.0);
      vec4 s100 = texture(state, coord, 100.0);
      vec4 s;
      for (int k = 0; k < 4; ++k)
        s[k] = texture(state, coord, blur[k])[k];
      vec4 h = texture(history, coord);
      s = coupling * (s - s100) + h;
      s = speed * s;
      s = mix(s1, vec4(0.5) + 0.5 * cos(s), 0.125);
      h = mix(s, h, decay);
      state_out = s;
      history_out = h;
    }

The non-linearity of the `cos()` on the coupled input acts like a "reaction", the blurring (looking up reduced mipmap levels from the texture) acts like "diffusion".

Colouring is done with another affine matrix transform, the output from which is thresholded and clamped before an edge-detection filter is applied. The edge detection uses dFdx and dFdy, so the results are coarse (these derivatives are typically computed for blocks of 2x2 pixels, rendered together in parallel). For better results the edge detection could be done in another pass, or the whole thing could run at double the resolution and be resized down to screen size afterwards.

Here's a video of it in action:

Here are some static images:

Here is another video, from January when it was still in colour:

Here's where you can get the code:

git clone https://code.mathr.co.uk/cca.git

Future work might be to do proper Gaussian blurs (it's separable, so even large radius might be feasible in real-time) instead of the cheap (but yielding squarish grid artifacts) mipmap reduction.

**EDIT** I worked on it some more, now in colour and with a high quality mode that does Gaussian blur (on my system frame rate drops from ~60fps to between ~5fps and ~30fps depending on blur radius). Pictures:

I also added a mutation mode, which randomizes the parameters one by one at random. Here's a final example video showing off the new features:

Downloads:

24bit FLAC (690 MB), VBR MP3 (78 MB), Ogg Vorbis (37 MB), C++ source code (4 kB)

A feedback process iteratively amplifies audio recognized from a control signal. In this version the control signal was one minute excerpted from the BBC Radio 4 drama serial "The Archers", and the source signal was 3 seconds of pink noise. Inspired by Deep Dream, but no neural networks were involved this time.

To be more precise, for each overlapping window of source audio, the program finds the nearest matching window of control audio, where the metric is the squared Euclidean distance between the normalized energy-per-octave vectors. The match is combined with the input window (via the sine of their difference) and overlap-added after the end of the source audio. The process continues over the rolling buffer, eventually feeding back on itself (the previous output becomes the new input).

To be absolutely precise, here's the C++ source code using libsndfile and libfftw3:

    /*
    Emergent Protocol
    (c) 2017 Claude Heiland-Allen <claude@mathr.co.uk>
    g++ -std=c++11 -Wall -Wextra -pedantic -O3 -o ep ep.cpp -lfftw3f -lsndfile
    ./ep control.wav source.wav output.wav
    */

    #include <cassert>
    #include <cmath>
    #include <cstdlib>
    #include <cstring>
    #include <complex>
    #include <vector>

    #include <sndfile.h>
    #include <fftw3.h>

    #define CHANNELS 2
    #define OCTAVES 11
    #define FFTSIZE (1 << (OCTAVES + 2))
    #define OVERLAP 4
    #define SR 44100
    #define LENGTH (SR * 60 * 60)

    // normalized energy per octave, plus the raw audio of the window
    struct energy
    {
      float e[CHANNELS][OCTAVES];
      float a[FFTSIZE][CHANNELS];
    };

    float *window;
    float *fft_in;
    std::complex<float> *fft_out;
    fftwf_plan plan;

    // analyse overlapping windows of the control audio
    std::vector<energy> analyse(const char *filename)
    {
      std::vector<energy> result;
      SF_INFO info = { 0, 0, 0, 0, 0, 0 };
      SNDFILE *sndfile = sf_open(filename, SFM_READ, &info);
      assert(info.channels == CHANNELS);
      energy e;
      while (FFTSIZE == sf_readf_float(sndfile, &e.a[0][0], FFTSIZE))
      {
        float sum = 0;
        for (int c = 0; c < CHANNELS; ++c)
        {
          for (int s = 0; s < FFTSIZE; ++s)
          {
            fft_in[s] = window[s] * e.a[s][c];
          }
          fftwf_execute(plan);
          for (int i = 1, o = 0; o < OCTAVES; ++o, i <<= 1)
          {
            e.e[c][o] = 0;
            for (int s = i; s < i << 1; ++s)
            {
              sum += e.e[c][o] += std::norm(fft_out[s]);
            }
          }
        }
        for (int c = 0; c < CHANNELS; ++c)
        {
          for (int o = 0; o < OCTAVES; ++o)
          {
            e.e[c][o] /= sum;
          }
        }
        result.push_back(e);
        // step back so that successive windows overlap
        sf_seek(sndfile, -(OVERLAP - 1) * FFTSIZE / OVERLAP, SEEK_CUR);
      }
      sf_close(sndfile);
      return result;
    }

    float audio[LENGTH][CHANNELS];

    void generate(const std::vector<energy> &analysis, const char *infilename, const char *outfilename)
    {
      memset(&audio[0][0], 0, LENGTH * CHANNELS * sizeof(audio[0][0]));
      SF_INFO srcinfo = { 0, 0, 0, 0, 0, 0 };
      SNDFILE *srcsndfile = sf_open(infilename, SFM_READ, &srcinfo);
      assert(srcinfo.channels == CHANNELS);
      sf_count_t r = 0;
      sf_count_t w = sf_readf_float(srcsndfile, &audio[r][0], LENGTH);
      sf_close(srcsndfile);
      while (w + FFTSIZE < LENGTH)
      {
        // analyse the window at the read position
        energy e;
        float sum = 0;
        for (int c = 0; c < CHANNELS; ++c)
        {
          for (int s = 0; s < FFTSIZE; ++s)
          {
            fft_in[s] = window[s] * audio[r + s][c];
          }
          fftwf_execute(plan);
          for (int i = 1, o = 0; o < OCTAVES; ++o, i <<= 1)
          {
            e.e[c][o] = 0;
            for (int s = i; s < i << 1; ++s)
            {
              sum += e.e[c][o] += std::norm(fft_out[s]);
            }
          }
        }
        for (int c = 0; c < CHANNELS; ++c)
        {
          for (int o = 0; o < OCTAVES; ++o)
          {
            e.e[c][o] /= sum;
          }
        }
        // find the nearest control window (squared Euclidean distance)
        auto target = analysis.begin();
        float m = 1.0f/0.0f;
        for (auto i = analysis.begin(); i != analysis.end(); ++i)
        {
          float s = 0;
          for (int c = 0; c < CHANNELS; ++c)
          {
            for (int o = 0; o < OCTAVES; ++o)
            {
              float d = (*i).e[c][o] - e.e[c][o];
              s += d * d;
            }
          }
          if (s < m)
          {
            m = s;
            target = i;
          }
        }
        // overlap-add the result at the write position
        for (int c = 0; c < CHANNELS; ++c)
        {
          for (int s = 0; s < FFTSIZE; ++s)
          {
            audio[w + s][c] += window[s] * sin(audio[r + s][c] - (*target).a[s][c]);
          }
        }
        r += FFTSIZE / OVERLAP;
        w += FFTSIZE / OVERLAP;
      }
      SF_INFO dstinfo = { 0, SR, CHANNELS, SF_FORMAT_WAV | SF_FORMAT_FLOAT, 0, 0 };
      SNDFILE *dstsndfile = sf_open(outfilename, SFM_WRITE, &dstinfo);
      sf_writef_float(dstsndfile, &audio[0][0], LENGTH);
      // close the output so the file header is finalized
      sf_close(dstsndfile);
    }

    int main(int argc, char **argv)
    {
      if (! (argc > 3)) return 0;
      // raised cosine window
      window = (float *)fftwf_malloc(FFTSIZE * sizeof(*window));
      float g = 0.25;
      for (int s = 0; s < FFTSIZE; ++s)
      {
        window[s] = g * (1 - std::cos(2 * 3.141592653589793 * s / FFTSIZE));
      }
      fft_in = (float *)fftwf_malloc(FFTSIZE * sizeof(*fft_in));
      fft_out = (std::complex<float> *)fftwf_malloc(FFTSIZE * sizeof(*fft_out));
      plan = fftwf_plan_dft_r2c_1d(FFTSIZE, fft_in, (float (*)[2])fft_out, FFTW_PATIENT | FFTW_PRESERVE_INPUT);
      generate(analyse(argv[1]), argv[2], argv[3]);
      fftwf_destroy_plan(plan);
      fftwf_free(fft_out);
      fftwf_free(fft_in);
      fftwf_free(window);
      return 0;
    }

Eventually I want to learn enough about machine learning to re-implement this idea using neural networks, perhaps trained on a variety of classical instrument sounds, to create an orchestral noise symphony.

I hacked on Inflector Gadget to make it use double precision floating point (53 bits of mantissa compared to 24 for single precision). This allows more playing time before ugly pixelation artifacts appear. It requires OpenGL 4; if that doesn't work on your machine then you'll have to go back to v0.1 (sorry). Downloads:

inflector-gadget-0.2.tar.bz2 (sig)

inflector-gadget-0.2-win.zip (sig)

or get the freshest source from git (browse inflector-gadget):

git clone https://code.mathr.co.uk/inflector-gadget.git

There's also a new command, 'D', which prints out the inflection coordinates. I added it so I could play with adaptive precision algorithms, more on that soon.
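
The mantissa widths quoted earlier in this post can be checked empirically. The sketch below (helper names are mine, nothing here comes from Inflector Gadget, and it assumes IEEE 754 single and double precision) halves a small increment until adding it to 1 no longer changes the result, counting the steps. `<float.h>` exposes the same figures as `FLT_MANT_DIG` and `DBL_MANT_DIG`.

```c
#include <assert.h>
#include <float.h>

// Count the halvings of eps needed before 1 + eps rounds back to 1.
// The volatile store forces the sum to be rounded to float precision.
static int mantissa_bits_float(void)
{
  volatile float sum;
  float eps = 1.0f;
  int bits = 0;
  do {
    eps *= 0.5f;
    sum = 1.0f + eps;
    ++bits;
  } while (sum != 1.0f);
  return bits;
}

// Same measurement in double precision.
static int mantissa_bits_double(void)
{
  volatile double sum;
  double eps = 1.0;
  int bits = 0;
  do {
    eps *= 0.5;
    sum = 1.0 + eps;
    ++bits;
  } while (sum != 1.0);
  return bits;
}
```

The extra 29 bits are what buys the additional playing time before neighbouring pixels collapse onto the same coordinate.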

This evening (18:00 UTC) I'll be streaming live a performance with clive (environment for live-coding in C) to the Audioblast Festival #5 of experimental noise performance.

You can listen to the stream at:

http://apo33.org:8000/audioblast.ogg.m3u

You can also join the IRC channel at #apo33 on freenode.

**UPDATE** I uploaded my set: audio + code diff-cast.

A Latin square of order \(n\) is a matrix of \(n^2\) values each in the range \(\{1, 2, \ldots, n\}\), such that each value occurs exactly once in each row and each column. The number of Latin squares goes up very quickly with \(n\): see A002860 in the Online Encyclopedia of Integer Sequences. A subset is that of reduced Latin squares, where the first row and the first column are the sequence \((1 2 \ldots n)\) (counted by A000315). And a third group is Latin squares with the first row fixed as \((1 2 \ldots n)\) and no condition on the first column: A000479.
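
To make the definitions concrete, here is a small checker. It is a sketch written for this explanation, with hypothetical function names, and it is limited to orders up to 64 because it tracks seen values in a bitmask.

```c
#include <assert.h>
#include <stdbool.h>

// Return true if the n*n row-major matrix m contains only values 1..n,
// with each value exactly once in every row and every column.
static bool is_latin(const int *m, int n)
{
  assert(n > 0 && n <= 64);
  const unsigned long long full = n == 64 ? ~0ULL : (1ULL << n) - 1;
  for (int i = 0; i < n; ++i) {
    unsigned long long row = 0, col = 0;
    for (int j = 0; j < n; ++j) {
      int r = m[n * i + j]; // j-th entry of row i
      int c = m[n * j + i]; // j-th entry of column i
      if (r < 1 || r > n || c < 1 || c > n)
        return false;
      row |= 1ULL << (r - 1);
      col |= 1ULL << (c - 1);
    }
    // n entries covering all n values means no value repeats
    if (row != full || col != full)
      return false;
  }
  return true;
}

// Return true if m is Latin and its first row and first column are 1, 2, ..., n.
static bool is_reduced(const int *m, int n)
{
  if (!is_latin(m, n))
    return false;
  for (int i = 0; i < n; ++i)
    if (m[i] != i + 1 || m[n * i] != i + 1)
      return false;
  return true;
}
```

For example, the cyclic square with rows (1 2 3), (2 3 1), (3 1 2) is reduced, while swapping its last two rows leaves it Latin but no longer reduced.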

While answering a question on math.SE, I noticed the OEIS has very few terms of another sequence related to Latin squares, namely the number of classes of "structurally equivalent" Latin squares, where equivalence is over rotations, reflections, and permuting the symbols. The computer programs I wrote to search for the answers to the question finished in a long but manageable amount of time, so I wrote a program to search for the next term of A264603:

    // gcc -std=c99 -Wall -Wextra -pedantic -O3 -march=native A264603.c
    // ./a.out order

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    // order
    static int O = 0;

    // generated square
    static char *square = 0;

    // buffer for normalization of symmetrical squares
    static char *squares = 0;

    // counter for progress
    static long long total = 0;

    // counter for uniques
    static long long unique = 0;

    // make first row be ABC... in-place
    static inline void relabel(char *s)
    {
      char label[O];
      for (int i = 0; i < O; ++i)
        label[s[i] - 'A'] = 'A' + i;
      for (int i = 0; i < O; ++i)
        for (int j = 0; j < O; ++j)
          s[O*i+j] = label[s[O*i+j] - 'A'];
    }

    // wrap strcmp with comparator type
    static int compare(const void *a, const void *b)
    {
      return strcmp(a, b);
    }

    // find lexicographically least of all relabeled symmetries
    // this acts as the canonical representative for the structure class
    static inline void normalize(void)
    {
      // regular
      int k = 0;
      for (int i = 0; i < O; ++i)
        for (int j = 0; j < O; ++j)
          squares[k+O*i+j] = square[O*i+j];
      relabel(&squares[k]);
      // rotated 90
      k += O * O + 1;
      for (int i = 0; i < O; ++i)
        for (int j = 0; j < O; ++j)
          squares[k+O*(O-j-1)+i] = square[O*i+j];
      relabel(&squares[k]);
      // rotated 180
      k += O * O + 1;
      for (int i = 0; i < O; ++i)
        for (int j = 0; j < O; ++j)
          squares[k+O*(O-i-1)+(O-j-1)] = square[O*i+j];
      relabel(&squares[k]);
      // rotated 270
      k += O * O + 1;
      for (int i = 0; i < O; ++i)
        for (int j = 0; j < O; ++j)
          squares[k+O*j+(O-i-1)] = square[O*i+j];
      relabel(&squares[k]);
      // reflect I
      k += O * O + 1;
      for (int i = 0; i < O; ++i)
        for (int j = 0; j < O; ++j)
          squares[k+O*(O-i-1)+j] = square[O*i+j];
      relabel(&squares[k]);
      // reflect J
      k += O * O + 1;
      for (int i = 0; i < O; ++i)
        for (int j = 0; j < O; ++j)
          squares[k+O*i+(O-j-1)] = square[O*i+j];
      relabel(&squares[k]);
      // reflect IJ
      k += O * O + 1;
      for (int i = 0; i < O; ++i)
        for (int j = 0; j < O; ++j)
          squares[k+O*j+i] = square[O*i+j];
      relabel(&squares[k]);
      // reflect JI
      k += O * O + 1;
      for (int i = 0; i < O; ++i)
        for (int j = 0; j < O; ++j)
          squares[k+O*(O-1-j)+(O-1-i)] = square[O*i+j];
      relabel(&squares[k]);
      // normalize
      qsort(squares, 8, O * O + 1, compare);
    }

    // return 1 if square is not Latin at index i,j
    static inline int prune(int i, int j)
    {
      char symbol = square[O*i+j];
      for (int q = 0; q < j; ++q)
        if (symbol == square[O*i+q])
          return 1;
      for (int p = 0; p < i; ++p)
        if (symbol == square[O*p+j])
          return 1;
      return 0;
    }

    static inline void output(void)
    {
      // output normalized representation
      normalize();
      if (! compare(square, squares))
        unique++;
      // report progress
      total++;
      if ((total & 0xFFFF) == 0)
        fprintf(stderr, "\r%lld %lld ", total, unique);
    }

    // depth first search across space of Latin squares with pruning
    static void generate(int i, int j)
    {
      if (j == O)
      {
        i += 1;
        j = 0;
      }
      if (i == O)
      {
        output();
        return;
      }
      if (i == 0)
      {
        // first row is ABC... wlog
        square[O*i+j] = 'A' + j;
        generate(i, j + 1);
      }
      else
      {
        // try each possibility for next cell
        for (int k = 0; k < O; ++k)
        {
          square[O*i+j] = 'A' + k;
          if (prune(i, j))
            continue;
          generate(i, j + 1);
        }
      }
    }

    // entry point
    int main(int argc, char **argv)
    {
      if (argc > 1)
        O = atoi(argv[1]);
      if (! (0 < O))
      {
        fprintf(stderr, "usage: %s order\n", argv[0]);
        return 1;
      }
      square = calloc(1, O * O + 1);
      squares = calloc(1, 8 * (O * O + 1));
      generate(0, 0);
      printf("\norder: %d\ntotal: %lld\nunique: %lld\n", O, total, unique);
      return 0;
    }

For orders 1 through 6 it matches up with the OEIS page, and for order 7 the output after around 16 hours of computation is:

\[1524901344\]

You can download the C source code: A264603.c

(Graph labels are missing "~", couldn't figure out how to display "~" with GNUPlot...)

Pure-data's [expr~] is convenient for writing arithmetic expressions involving signal vectors, but unfortunately it has quite an overhead - it's much slower than patching objects together, [fexpr~] even more so.

This is because [expr~] interprets the parsed expression each DSP block, and because its inner loops don't have the unrolling tricks of Pd's signal arithmetic objects (search the source code for perf8 or perform8 or similar to see what I mean by that).
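
For readers who have not looked at the Pd source, the kind of unrolling meant here looks roughly like the following. This is an illustrative sketch in the spirit of those perform routines, not code copied from Pd, and the function names are made up.

```c
#include <assert.h>

// Straightforward element-wise addition: one sample per loop iteration.
static void add_simple(const float *a, const float *b, float *out, int n)
{
  while (n--)
    *out++ = *a++ + *b++;
}

// Hand-unrolled variant: eight samples per loop iteration, so the loop
// counter and branch cost are paid once per eight additions.
// Requires n to be a positive multiple of 8.
static void add_unrolled8(const float *a, const float *b, float *out, int n)
{
  assert(n > 0 && n % 8 == 0);
  for (; n; n -= 8, a += 8, b += 8, out += 8) {
    out[0] = a[0] + b[0];
    out[1] = a[1] + b[1];
    out[2] = a[2] + b[2];
    out[3] = a[3] + b[3];
    out[4] = a[4] + b[4];
    out[5] = a[5] + b[5];
    out[6] = a[6] + b[6];
    out[7] = a[7] + b[7];
  }
}
```

The multiple-of-8 restriction is harmless at Pd's default block size of 64 samples; an object would fall back to the simple loop for smaller blocks.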

One way of solving it, I thought, could be for [expr~] to dsp_add() the corresponding inner loop perform functions at DSP chain recompilation time (i.e., not very often). But that would require allocating signal vectors separately, as there is no way as far as I know to grab spare signal vectors from Pd's free pool (without having inlets or outlets to own them, that is). This would be a lot of work to get right.

An alternative (and the one I favour at the moment) might be for [expr~] to create a new canvas behind the scenes and patch together the primitive objects making up the expressions. The more I think about it the more problems it solves: no need to duplicate the inner loops in multiple classes, no need to allocate (and deallocate) signal vectors, Pd can do its cache-locality beneficial signal vector recycling, the DSP chain compiler for patches does the topological sort so we don't have to, ...

Not sure how much it would benefit [fexpr~], but it wouldn't be too hard(?) to add support using [block~ 1] and [delwrite~]/[delread~] for the feedback signals. And I think [expr] could stay as it is, not so likely to be much of a hot code path.

A stepping stone to [expr~] dynamic patching might be to write an offline processor, that replaces instances of [expr~] in patch files by subpatches.

Does anyone feel up to the challenge? I'm not doing so much any more with Pd these days, so I'll pass.

You can download the C and Pd source code for running the benchmarks: thoughts_on_expr.tar.bz2

Inspired by recent threads on FractalForums, I wrote a little gadget to do inflection mapping. This means translation and squaring of complex coordinates before regular Mandelbrot or Julia set iterations. Check the source for full details, the fragment shader GLSL is the place to start. You can download it here:

inflector-gadget-0.1.1.tar.bz2 (sig)

inflector-gadget-0.1-win.zip (sig)

or get the freshest source from git (browse inflector-gadget):

git clone https://code.mathr.co.uk/inflector-gadget.git

It's a bit rough around the edges, press H to print the help to the terminal you started it from (or read the documentation), but in short, each click wraps the pattern twice around the clicked point, allowing you to sculpt patterns out of Julia sets (or the Mandelbrot set if you press M).

Similar sculpting can be achieved by zooming into specific locations in the Mandelbrot set. The effect is much the same (the outer decorations are missing in the inflector gadget) but it takes a zillion times longer to zoom as deep as required.

In my previous post I tried to solve a differential equation using a port of Octave's adaptive Runge-Kutta 4/5 integration algorithm. It failed with energy explosions. Today I tried a different, much simpler, integration algorithm, and the energy explosions seem to be solved.

Velocity Verlet integration can be implemented for this problem in a few lines of code:

    #include <math.h>

    // compute acceleration from position
    static inline void f(double *a, const double x[2])
    {
      a[0] = -exp(x[1] * x[1]) * x[0];
      a[1] = -exp(x[0] * x[0]) * x[1];
    }

    // Velocity Verlet integration
    void compute()
    {
      const double h = 0.01;
      double x[2] = { 0, 1 };
      double v[2] = { 1, 1 };
      double a[2] = { 0, 0 };
      f(a, x);
      v[0] += 0.5 * h * a[0];
      v[1] += 0.5 * h * a[1];
      while (1)
      {
        x[0] += h * v[0];
        x[1] += h * v[1];
        f(a, x);
        v[0] += h * a[0];
        v[1] += h * a[1];
      }
    }

You can download C99 source code for a JACK client which sonifies the chaotic coupled oscillators. I've been running it for 45mins at 48kHz sample rate, and no explosions yet, but be careful in case this is still transient behaviour...

No explosions may seem like a good thing, but what if the original differential equations are truly explosive, and the stability is a computational artifact? I still don't know the answer to that question.
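
One sanity check that can be done is to run the same integrator on a system whose exact energy is known. The sketch below (function names are mine, unit mass assumed) applies velocity Verlet to plain simple harmonic motion, x'' = -w²x, and reports the largest relative deviation of the energy ½v² + ½w²x² from its initial value. It says nothing about the coupled system itself; it only shows how this integrator behaves when the true answer is "energy is conserved".

```c
#include <assert.h>
#include <math.h>

// Energy of a unit-mass harmonic oscillator with angular frequency w.
static double sho_energy(double x, double v, double w)
{
  return 0.5 * v * v + 0.5 * w * w * x * x;
}

// Integrate x'' = -w^2 x with velocity Verlet, starting from x = 1, v = 0.
// Returns the largest relative energy deviation seen over all steps.
static double verlet_energy_drift(double w, double h, long steps)
{
  assert(w > 0 && h > 0 && steps > 0);
  double x = 1, v = 0;
  double a = -w * w * x;
  const double e0 = sho_energy(x, v, w);
  double worst = 0;
  for (long i = 0; i < steps; ++i) {
    v += 0.5 * h * a; // half step of velocity
    x += h * v;       // full step of position
    a = -w * w * x;   // acceleration at the new position
    v += 0.5 * h * a; // second half step of velocity
    double d = fabs(sho_energy(x, v, w) - e0) / e0;
    if (d > worst)
      worst = d;
  }
  return worst;
}
```

With w = 1 and h = 0.01 the deviation stays small and bounded over a million steps rather than growing, which is the typical behaviour of this family of integrators on conservative systems.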

Simple harmonic motion is the solution to the differential equation:

\[\frac{\partial^2}{\partial t^2} x = -\omega^2 x\]

Interested in chaos, I wanted to make the angular frequency \(\omega\) be cross-coupled in a pair of oscillators, and I came up with this differential equation:

\[\begin{aligned} \frac{\partial^2}{\partial t^2} x &= -e^{y^2} x \\ \frac{\partial^2}{\partial t^2} y &= -e^{x^2} y \end{aligned}\]

It sounds something like this: audio snippet.

I initially experimented with Octave's ode45() function, but it was rather slow, so I ported it to C99 (specialized to 4-vectors containing the displacement and velocity of each oscillator). Unfortunately it exploded after some time, with the amplitude of the oscillators swinging ever-larger, and the frequency of oscillation getting very very high too (which meant that the adaptive step size Runge-Kutta integration scheme would effectively get stuck and stop making progress).

Investigating this crisis, I thought to plot the energy of the system, and sure enough it exploded:

So this experiment failed, and I'll have to try some other coupling expressions to see if they suffer the same fate or otherwise. Eventually I wanted to try controlling chaos by small perturbations to nudge the oscillators into unstable periodic orbits of various kinds, but no joy this week.

You can download the C source code for the integration calculations, gnuplot source code for the diagrams, and Octave source code for converting to audio.

After instrumenting Monotone with OpenGL timer queries I could see where the major bottleneck lay:

    IFS( 7011.936000 ) FLAT( 544.672000 ) PBO( 2921.728000 )
    SORT( 6797.760000 ) LUP( 71136.064000 )
    TEX( 284.224000 ) DISP( 272.480000 )

LUP is the per-pixel binary search lookup for histogram equalisation (to compress the dynamic range of the HDR fractal to something suitable for display), the previous SORT generates the histogram from a 4x4 downscaled image. A quick calculation shows that this LUP is taking 80% of the GPU time, so is a good focus for optimisation efforts.
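
Spelling out that quick calculation (the values are copied from the timer output above, in whatever units the instrumentation printed, which the post does not state; the names are made up for this sketch):

```c
#include <assert.h>

// Stage indices, in the order the timings were printed.
enum { IFS, FLAT, PBO, SORT, LUP, TEX, DISP, STAGES };

// Timer query results before the optimisation.
static const double timings_before[STAGES] = {
  7011.936, 544.672, 2921.728, 6797.760, 71136.064, 284.224, 272.480
};

// Fraction of the total time spent in one stage.
static double stage_fraction(const double *timings, int stage)
{
  assert(0 <= stage && stage < STAGES);
  double total = 0;
  for (int i = 0; i < STAGES; ++i)
    total += timings[i];
  return timings[stage] / total;
}
```

`stage_fraction(timings_before, LUP)` comes out just under 0.8, which is where the 80% figure comes from.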

The 4x4 downscaled image for the histogram is still a lot of pixels: 129600. LUP involves finding an index into this array, which gives a value with around 17bits of precision. However, typical computer displays are only 8bit (256 values) so the extra 9 random-access texture lookups per pixel to get a more accurate value are a waste of time and effort. Combined with a reduction of the downscaled image to 8x8, the optimisation to compute a less accurate (but visually indistinguishable) histogram equalisation allows Monotone to now run at 30fps at 1920x1080 full HD resolution. Here are the post-optimisation detailed timing metrics:

    IFS( 7087.104000 ) FLAT( 509.888000 ) PBO( 2744.864000 )
    SORT( 1409.440000 ) LUP( 15696.352000 )
    TEX( 281.472000 ) DISP( 290.848000 )
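
The truncated lookup can be sketched on the CPU as follows. This is an illustration only: Monotone does this in a shader with texture lookups, and the function name and the choice of returning the midpoint of the remaining interval are my assumptions. Each bisection step halves the candidate range, so 8 steps locate the rank to within 1/256 of the array, while 129600 entries need 17 steps for an exact answer.

```c
#include <assert.h>

// Approximate the rank of v within sorted[0 .. n-1] as a fraction in [0, 1],
// using at most `steps` bisection steps (one array lookup per step).
static double approx_rank(const float *sorted, int n, float v, int steps)
{
  assert(n > 0 && steps > 0);
  int lo = 0, hi = n; // the exact rank always lies in [lo, hi]
  for (int i = 0; i < steps && lo < hi; ++i) {
    int mid = lo + (hi - lo) / 2;
    if (sorted[mid] < v)
      lo = mid + 1;
    else
      hi = mid;
  }
  // stop early and answer with the middle of what is left
  return 0.5 * (lo + hi) / n;
}
```

With `steps` set to 8 the result differs from the exact rank fraction by less than one 8-bit display level, which is why the shortcut is visually indistinguishable.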

A productive day!

December's calendar image was a reaction-diffusion simulation, part of the RDEX project. If you want to explore the variety of patterns that emerge from this kind of stuff, a static mirror of the Kiblix RDEX server is still online to play with.

My video piece Monotone has been accepted to MADATAC 08 (2017). The exhibition runs January 12th to February 5th, at Centro Conde Duque, Madrid, Spain, and there is also a screening on January 17th.

There were some issues with video codecs; this is the command that worked out:

    ffmpeg -i video.mkv -i audio.wav \
      -pix_fmt yuv420p -codec:v libx264 -profile:v high -level:v 4.1 -b:v 20M -b:a 192k \
      monotone.mov