CARTO users want to see maps drawn quickly, and drawing maps quickly means rendering every geometry in a table to an output format. By deploying the latest version of PostgreSQL to our infrastructure, we are taking advantage of improved parallel query support to make our maps and widgets go faster.

This would be the shortest blog post of all time except for one inconvenient fact: stock PostgreSQL 10 with stock PostGIS 2.4 doesn’t parallelize spatial queries very well at all.

In particular, for the kinds of routine calculations that CARTO runs, parallel processing doesn’t kick in:

  • Parallel scans still require higher function costs to come into action, even in reasonable cases.
  • Parallel joins on spatial conditions still seem to have poor planning, requiring a good deal of manual poking to get parallel plans.

To get the kind of parallel performance we wanted from PostgreSQL 10 + PostGIS 2.4 we had to tweak both packages a little bit.

  • PostgreSQL required two tweaks to the planner to generate more agressive parallel plans with PostGIS functions in them
  • PostGIS required a simple, yet extensive, tweak to add higher costs over the whole collection of PostGIS functions

PostgreSQL Planner Pain

For users of PostGIS, one of the big surprises (and disappointments) of stock parallel support in PostgreSQL 10 is that the planner only looks at the contents of the WHERE clause when determining if a scan should be parallelized.

-- This MAY be parallelized (if there enough rows)
FROM mytable
WHERE ST_Area(the_geom_webmercator) > 10000

On the other hand, a query with even a very costly spatial function (like a buffer) on the SELECT line (what PostgreSQL hackers call the “target list”) won’t get parallelized no matter how many rows are in play.

-- This WON'T be parallelized
SELECT ST_Buffer(the_geom_webmercator, 10) AS the_geom_webmercator 
FROM mytable

Fortunately for us at CARTO, this behaviour was pointed out on the pgsql-hackers email list only a couple months before we needed a fix.

If I have a slow function which is evaluated in a simple seq scan, I do not get parallel execution, even though it would be massively useful. Unless force_parallel_mode=ON, then I get a dummy parallel plan with one worker.
– Jeff Janes, pgsql-hackers, 2017-07-11

The ensuing discussion led to a revised patch from Amit Kapila of EnterpriseDB that would generate parallel plans for queries that included costly functions in the target list.

We took Amit’s patch and applied it to our PostgreSQL 10 fork, so we can deploy a stable PostgreSQL branch with improved parallelism behaviour for our use cases.

While Amit has continued to revise his patch over the succeeding months, it has not been committed to the main PostgreSQL development tree yet, so there’s no guarantee that PostgreSQL parallel behaviour for costly functions on the target list will improve in version 11.

The High(er) Costs of PostGIS

As a general proposition, the costs of PostGIS spatial functions are much much higher than the costs of ordinary functions in PostgreSQL. Even though all PostGIS functions are written in C, they still have to perform vastly more calculations to return a result than most standard PostgreSQL functions: computational geometry is complicated!

Prior to parallelism, the declared COST of a function would rarely effect the query plan generated by PostgreSQL – no matter what plan was chosen, the function result had to be calculated.

With parallelism though, the decision about whether or not to set up a parallel plan is a function of number of rows to be processed times function cost. A higher function cost can flip a plan over from single-worker to parallel.

PostGIS has not globally applied costs to functions in the past, because the planner had no use for it, and it’s a lot of work – there’s hundreds of functions in PostGIS. Why spend all that effort to add costs that would never actually affect a plan?

However, with Amit’s patch we had a use for costs, so we added explicit costs to a number of high-cost spatial functions.

CREATE OR REPLACE FUNCTION ST_Centroid(geography, use_spheroid boolean DEFAULT true)
 	RETURNS geography
 	AS 'MODULE_PATHNAME','geography_centroid'
  COST 100;

What, Still Broken?

At this point, we had:

  • added a PostgreSQL planner patch to honor function costs more effectively, and
  • adjusted PostGIS function costs appropriately to capture the high cost of spatial functions.

So, we should be done! But we weren’t.

A lot of very heavily used PostGIS functions are actually SQL wrappers that cover up more complex combinations of other functions. For example:

  FUNCTION ST_Intersects(geom1 geometry, geom2 geometry)
  RETURNS boolean
  AS 'SELECT $1 && $2 AND _ST_Intersects($1,$2)'

PostGIS expects that functions of this sort will be “inlined”. That PostgreSQL will, when finding one of these functions, replace the function with contents of the function definition. So:

WHERE ST_Intersects(a.geom, b.geom)

Would be silently inlined as:

WHERE a.geom && b.geom AND _ST_Intersects(a.geom, b.geom)

The trouble is that deep in the PostgreSQL code base is logic that determines if a given function can be inlined, and one of the things that determines whether inlining is allowed is if the functions being inlined are not too costly. And we just increased the costs of many of our functions.

The PostgreSQL inlining behaviour seems to be justified for users who use SQL wrapper functions as a way to force the caching of expensive calculations underneath the wrapper. That is never our use case at CARTO, so we don’t mind losing that behaviour in order to get back “sensible” inlining with our properly costed functions.

Again we applied a very small patch to our PostgreSQL 10 stable branch.

The problem of losing inlining behaviour was discussed on the pgsql-hackers list in November 2017, but no patch was offered, and our patch is not suitable for general application since it changes behaviours that some users might want to retain.

I Want That

The simplest way to ape the CARTO installation is to just use our patched PostgreSQL and PostGIS forks:

If you want to roll your own, you can: