Some weeks ago we announced that Sandro Santilli joined the CartoDB team. He has been working hard on making PostGIS 2.0 and improving CartoDB, looking at queries we are making. Particularly he is been profiling rendering of CartoDB map tiles to find improvements opportunities.
CartoDB data is fetched from PostGIS and rendered by Mapnik. The pipeline hot spot was the rendering part, keeping the CPU pretty busy. How to calm it down ? Reduce the workload!
Mapnik workload is number of input vertices. What options do we have to reduce them? PostGIS has three functions that can be used to reduce the complexity of geometries: ST_SnapToGrid, ST_Simplify and ST_SimplifyPreserveTopology.
ST_SnapToGrid and ST_Simplify are available in PostGIS since 1.0.0 and before (circa 2003) and were implemented specifically to reduce workload of client-side mapping applications (where both browser memory and transfer time are a concern).
ST_SimplifyPreserveTopology was added in 1.3.3 (2008) to avoid the dimensional collapses and the introduction of invalidities in geometries being reduced. This one is implemented in GEOS so has the disadvantage of an overhead in conversion between PostGIS and GEOS geometry structures.
Let’s see how these methodologies compared.
The dataset under examination was a layer of 8,094 Italian municipality boundaries with a total of 4,845,240 vertices. We rendered the layer at 5 different zoom levels over a 600x400 image so that: level0 rendered all of Italy in 24x30 pixels, level1 rendered all Italy in 295x400 pixels, level2 only containd about 1/3 of the country, and level3 contained a single region. Parameters passed to simplifiers were of course dependent on zoom level.
Finally a run was made with pre-simplified vectors to make sure any dynamic simplification would only hit the worst case of having no effect. Timings were taken using ‘mapnik-speed-check’ running 10 times. They include data fetching (and thus query execution) and rendering.
In the schema below “SimpTP” is ST_SimplifyPreserveTopology, “Simp” is ST_Simplify, “Snap” is ST_SnapToGrid and “vanilla” is no preprocessing. With “v” you have the total number of vertices fetched by the query and “g” gives you number of non-empty geometries over total number of geometries.
In the image above, first map (called “vanilla” in the numbers following) takes more time to render; next image (“SimpTP”) loads faster. Here are the most interesting numbers:
level0 (full extent on 24x30 pixel)
level1 (full extent on 295x400 pixels)
level2 (1/3 of extent on 600x400 pixels)
level4 (1/10 of extent on 600x400 pixels)
level4 (1/10 of extent on 600x400 pixels - PRESIMPLIFIED!)
Generally speaking a simple ST_Simplify() call is what drop most vertices and it’s fast enough not to represent a problem even when there’s nothign to do for it.
Starting from there we contributed a patch to mapnik for exposing a style file parameter to request automatic simplification based on resolution. The patch is strictly for the PostGIS provider.
Congrats Sandro on this great work. This is not only great news for CartoDB users, but also for Openstreemap users which will potentially see their maps being rendered with Mapnik much faster.