Measuring tile rendering processes to improve performance

One of the most important parts of our infrastructure is how we generate tiles. This process consists of two parts: gathering data and rendering tiles. As part of our mission to maintain the best possible product for our clients, we set out to improve the performance of our tiles. To do so, we needed to better understand and measure the tile rendering process. We determined that data gathering had a much wider margin for improvement than the actual tile rendering, given its time and resource consumption.

Put simply, the tile rendering process has two main steps:

  1. Gather the required data from the database filtered by tile bounding box.
  2. Send this data to the renderer and generate the tile, which can be a raster tile (png), a vector tile (mvt), a grid, and so on.
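Step 1 boils down to computing the EPSG:3857 bounding box of the requested tile and filtering the query by it. A minimal Python sketch of that conversion (the function name and constant are illustrative, not our actual code) could look like this:

```python
# Half the extent of the EPSG:3857 (Web Mercator) world, in meters.
HALF_WORLD = 20037508.342789244

def tile_bbox_3857(z, x, y):
    """Bounding box (xmin, ymin, xmax, ymax) of tile z/x/y in EPSG:3857."""
    tile_size = 2 * HALF_WORLD / (2 ** z)  # width/height of one tile in meters
    xmin = -HALF_WORLD + x * tile_size
    ymax = HALF_WORLD - y * tile_size      # tile y grows downwards
    return (xmin, ymax - tile_size, xmin + tile_size, ymax)

# Zoom 0 is a single tile covering the whole world.
print(tile_bbox_3857(0, 0, 0))
```

The resulting box is what ends up inside the `ST_MakeEnvelope`/`BOX3D` filter of the tile query.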

Depending on the size of the table, how the data is queried, and other minor factors, calls to the database could take anywhere from 10 milliseconds to 20 seconds. We knew we were rendering the same tiles multiple times, but we didn’t know how this affected rendering performance. In any case, we didn’t want to keep executing the same query without a good reason to do so.

Our first task was to understand how many times we were drawing the same tile by gathering the necessary data. The only reliable source for this data at the time was our database logs, so we needed to process our PostgreSQL logs and extract information such as the query execution time, the XYZ coordinates of the requested tile, the database, the affected tables, the username, and so on.

There are great log analyzers for PostgreSQL, such as pgBadger. However, these analyzers did not allow us to run operations such as transforming the tile bounding box to XYZ format, so we ended up building a custom parser to better serve our needs.

Processing the PostgreSQL logs proved more challenging than we had originally estimated. While each log line starts with a common, configurable prefix, everything after it can be custom, and we discovered many consistency issues. Our approach ran into a series of challenges:

  • Some queries were split across multiple lines and needed to be merged
  • A ton of information we did not need was slowing down our processing
  • Line formatting was inconsistent
  • Prepared statements have three parts (parse, bind, and execute), each with its own execution time that we needed to aggregate
  • We had huge amounts of log data to process
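To give an idea of the stitching step, here is a minimal Python sketch that merges split log lines back into whole queries. The regex is an assumption based on the sample log line shown below and would need adapting to the actual `log_line_prefix` configuration:

```python
import re
from collections import defaultdict

# Matches the syslog + PostgreSQL prefix seen in our logs; the "[2254-1]"
# token is a (statement, part) counter we use to stitch split queries back
# together. The exact field layout is an assumption for illustration.
PREFIX = re.compile(
    r'^\w{3} +\d+ [\d:]+ (?P<host>\S+) postgres\[(?P<pid>\d+)\]: '
    r'\[(?P<stmt>\d+)-(?P<part>\d+)\] ?(?P<rest>.*)$'
)

def merge_lines(lines):
    """Yield one merged record per statement, joining continuation lines."""
    chunks = defaultdict(list)
    for line in lines:
        m = PREFIX.match(line)
        if not m:
            continue  # lines we cannot parse are skipped, not fatal
        key = (m.group('pid'), m.group('stmt'))
        # "#011" is how syslog escapes a tab character.
        chunks[key].append(m.group('rest').replace('#011', ' ').strip())
    for key in sorted(chunks, key=lambda k: (int(k[0]), int(k[1]))):
        yield ' '.join(chunks[key])
```

A real parser also has to aggregate the parse/bind/execute durations of prepared statements, which this sketch leaves out.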

Since we are working with geospatial databases, we needed to extract the XYZ value from the bounding box of each query, originally stored as coordinates in the EPSG:3857 (Web Mercator) projection:

ST_SetSRID('BOX3D(-2504688.542848656 -20037508.3,20037508.3 2504688.542848656)'::box3d, 3857)

In addition to this, we had to take into account the possible values for metatiling and buffer-size that we were using for each query to help us make better assumptions about optimization strategies.
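Ignoring metatiling and buffers for a moment, the inverse conversion, from a bounding box back to XYZ, can be sketched as follows (illustrative Python, not our parser's actual code):

```python
import math

# Half the extent of the EPSG:3857 world, in meters.
HALF_WORLD = 20037508.342789244
WORLD = 2 * HALF_WORLD

def bbox_to_xyz(xmin, ymin, xmax, ymax):
    """Recover z/x/y from an unbuffered EPSG:3857 tile bounding box.

    A real parser must first account for the buffer and metatile factor,
    both of which change the box size; this sketch assumes a plain tile.
    """
    z = round(math.log2(WORLD / (xmax - xmin)))
    tile = WORLD / (2 ** z)
    x = round((xmin + HALF_WORLD) / tile)
    y = round((HALF_WORLD - ymax) / tile)  # tile y grows downwards
    return z, x, y
```

Because metatiles and buffers inflate the box, the measured width no longer matches a power-of-two fraction of the world, which is why we had to track those settings per query.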

Here is an example of a query used to generate a tile:

Mar 27 11:44:44 host postgres[13993]: [2254-1] 2017-03-27 11:44:44 GMT publicuser databasename [unknown] LOG:  duration: 2465.723 ms  execute <unnamed>: SELECT ST_AsTWKB(ST_Simplify(ST_RemoveRepeatedPoints("the_geom_webmercator",10000),10000,true),-4) AS geom FROM (select
Mar 27 11:44:44 host postgres[13993]: [2254-2] #011  c.the_geom,
Mar 27 11:44:44 host postgres[13993]: [2254-3] #011  st_transform(st_makevalid(c.the_geom_webmercator), 3786) as the_geom_webmercator,
Mar 27 11:44:44 host postgres[13993]: [2254-4] #011  s.iso,
Mar 27 11:44:44 host postgres[13993]: [2254-5] #011  s.score
Mar 27 11:44:44 host postgres[13993]: [2254-6] #011FROM scores as s JOIN countries as c on c.iso = s.iso
Mar 27 11:44:44 host postgres[13993]: [2254-7] #011WHERE filter='foo'
Mar 27 11:44:44 host postgres[13993]: [2254-8] #011) as cdbq WHERE "the_geom_webmercator" && ST_SetSRID('BOX3D(-2504688.542848656 -20037508.3,20037508.3 2504688.542848656)'::box3d, 3857)

And this is the result, in CSV format, after we process the log:

host,database,user,tables,timestamp,duration,bbox,x,y,z,query,update,basemaps
host,databasename,publicuser,"{countries,scores}",2017-03-27 11:44:45,2313.126,"-2504688.542848656,-2504688.542848656,20037508.3,20037508.3",2,0,2,"SELECT ST_AsTWKB(ST_Simplify(ST_RemoveRepeatedPoints(""the_geom_webmercator"",10000),10000,true),-4) AS geom FROM (select c.the_geom, st_transform(st_makevalid(c.the_geom_webmercator), 3786) as the_geom_webmercator, s.iso, s.score FROM scores as s JOIN countries as c on c.iso = s.iso WHERE filter='foo' ) as cdbq WHERE ""the_geom_webmercator"" && ST_MakeEnvelope(-2504688.542848656,-2504688.542848656,20037508.3,20037508.3,3857)",False,False

As you can see, this gives us much more user-friendly data to work with.

Once we had processed all the logs, we loaded the resulting data into the database, which then allowed us to extract the most useful information. Now we know query runtimes and how many times we generate the same tile, which helps guide us in improving the process.
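As an illustration of the kind of aggregation this enables, the following self-contained sketch counts repeated tile generations with SQLite standing in for our database; the table layout and data are toy values mirroring the CSV header, not our production schema:

```python
import sqlite3

# Toy version of the analysis we run after loading the processed CSV:
# count how often the same z/x/y is rendered and its total query time.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE tile_queries (z INT, x INT, y INT, duration REAL)')
conn.executemany('INSERT INTO tile_queries VALUES (?, ?, ?, ?)', [
    (2, 2, 0, 2465.0), (2, 2, 0, 1535.0), (3, 1, 1, 120.0),
])
rows = conn.execute("""
    SELECT z, x, y, COUNT(*) AS times_rendered, SUM(duration) AS total_ms
    FROM tile_queries
    GROUP BY z, x, y
    HAVING COUNT(*) > 1          -- tiles we generated more than once
    ORDER BY total_ms DESC
""").fetchall()
print(rows)  # → [(2, 2, 0, 2, 4000.0)]
```

Tiles that show up repeatedly with high total durations are the natural candidates for caching or query optimization.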

Do you want to know more about how we improve our tile generation process? More CARTO blog posts are coming to our Inside blog.

About the author

Mario de Frutos, Senior Backend Engineer at CARTO.
