CARTO is open source, and is built on core open source components, but historically most of the code we have written has been in the “CARTO” parts, and we have used the core components “as is” from the open source community.
While this has worked well in the past, we want to increase the velocity at which we improve our core infrastructure, and that means getting intimate with the core projects: PostGIS, Mapnik, PostgreSQL, Leaflet, MapBox GL and others.
Our new core technology team is charged with being the in-house experts on the key components, and the first problem they have tackled is squeezing more performance out of the core technology for our key use cases. We called the project “5x” as an aspirational goal – can we get multiples of performance improvement from our stack? We knew “5x” was going to be a challenge, but by trying to get some percentage improvement from each step along the way, we hoped to at least get a respectable improvement in global performance.
A typical CARTO visualization might consist of a map and a couple widget elements.
The map will be composed of perhaps 12 (visible) tiles, which the browser will download in parallel, 3 or 4 at a time. In order to get a completed visualization delivered in under 2 seconds, that implies the tiles need to be delivered in under 0.5s and the widgets in no more than 1s.
Ideally, everything should be faster, so that more load can be stacked onto the servers without affecting overall performance.
The time budget for a tile can be broken down even further:
The time budget for a widget is basically all on the database:
The project goal was to add incremental improvements to as many slices as possible, which would hopefully together add up to a meaningful difference.
In order to continuously improve the core components, we needed to monitor how changes affected the overall system, against both a long-term baseline (for project-level measurements) and short-term baselines (for patch-level measurements).
To get those measurements, we:
Using the measurements as a guide, we then attacked the performance problem.
Profiling and running performance tests, and doing a little bit of research showed up three major opportunities for performance improvements:
Most importantly, during our work on core code improvements, we brought all the core software into the CARTO build and deployment chain, so these and future improvements can be quickly deployed to production without manual intervention.
We want to bring our improvements back to the community versions, and at the same time have early access to them in the CARTO infrastructure, so we follow a policy of contributing improvements to the community development versions while back-patching them into our own local branches (PostGIS, Mapnik, PostgreSQL).
Did we get to “5x”? No, in our end-to-end benchmarks, we notched a range of different improvements, ranging from a new percent to a few times, depending on the use cases. We also found our integration benchmarks were sensitive to pressure from other load on our testing servers, so relied mostly on micro-benchmarks of different components to confirm local performance improvements.
While the performance improvements have been gratifying, some of the biggest wins have been the little improvements we made along the way:
We made a lot of performance improvements across all the major projects in which CARTO is based upon: you may have already noticed those improvements. We’ve also shown that optimizations can only get you that so far – sometimes taking a whole new approach is a better plan. A good example of this is the vector and raster data aggregations work we have been doing, reducing the amount of data transfered with clever summarization and advanced styling.
More changes from our team are still rolling out, and you can expect further platform improvements as time goes on. Keep on mapping!
Please fill out the below form and we'll be in touch real soon.