Complex squaring

Squaring a complex number is a key algorithm needed for rendering the Mandelbrot set, among other things. I noticed Kalles Fraktaler uses a method with 3 real number squares and no multiplications, so I compared a few algorithms:

naive
standard method commonly used as a definition
\[ (x + i y)^2 = (x^2 - y^2) + i (2 x y) \]
1 multiplication, 2 squares, 1 additions
Karatsuba
modified using difference of squares factorisation
\[ (x + i y)^2 = (x + y) (x - y) + i (2 x y) \]
2 multiplications, 2 additions
Kalles
as implemented by Kalles Fraktaler
\[ (x + i y)^2 = (x^2 - y^2) + i ((x + y)^2 - (x^2 + y^2)) \]
3 squares, 4 additions (the subexpressions can be shared)

Because the relative cost of these three operations (multiplication, squaring, addition - multiplication by 2 is constant time) varies depending on the precision of the real variables, I decided to benchmark the different algorithms. Here are the results from one machine, the values in the table are the number of iterations of \( z \to z^2 + c \) that could be performed with a 10 second timeout:

precisionnaiveKaratsubaKalles
100 28868287 40797727 24932491
316 20344558 27222494 19755743
1000 9549975 11122112 9755611
3162 2475382 2726978 2810427
10000 446160 502960 510899
31623 78343 88965 91640
100000 15013 17294 17419
316228 2955 3513 3403
1000000 591 760 673

The best algorithm at each precision is highlighted in green. As you can see, Karatsuba is best for low precisions and very high precisions, but between around 3000 and 100000 bits Kalles method is faster. I'm not sure if accuracy will suffer for any of these methods, at least the MPC library based on MPFR uses the Karatsuba algorithm in its mpc_sqr() function so it should be reasonably reliable.

You can download the source: complex-bench.c which contains compilation instructions at the top.