New spatial functions in BigQuery, starting with Uber H3-js

Summary

CARTO is working very closely with the Google BigQuery team, bringing in new spatial functions and supporting the creation of a next generation spatial data infrastructure stack.

This post may describe functionality for an old version of CARTO. Find out about the latest and cloud-native version here.
New spatial functions in BigQuery, starting with Uber H3-js

As you may have seen in last week's announcement  we are now working very closely with the Google BigQuery team to support the creation of a next generation spatial data infrastructure stack.

BigQuery GIS does have support for some spatial functions  but this is a limited set that we believe will continue growing over time. In the meantime  if you need to perform a spatial operation that is not supported out of the box  you can create your own User Defined Functions. UDFs in BigQuery can be programmed in Javascript  for which there are a lot of spatial libraries already available. In order to use them  you have to make slight modifications  push them to cloud storage  and then reference them  which can be quite time consuming. Right now  every user of BigQuery would have to do the same  which is also a bit of a pain.

Luckily  BigQuery works as if it were a huge multitenant database  where all the databases of all users are on the same server  and there are only permissions separating them. This means you can create a database and give permissions to the entire world  so that everyone can use it for doing joins or calling functions. This is how the public data project works and the reason why we are collaborating to add location datasets. This means there is no need to install anything  allowing you to use these functions directly on your queries. For example:

{% highlight sql %}SELECT jslibs.h3.ST_H3(ST_GEOGPOINT(-74.00 40.7128) 7){% endhighlight %}

And all of these UDFs functions are now available as an Open Source project and live directly on BigQuery for anybody to use.

BigQuery JS Libs

BigQuery JS Libs: A repository of pre-packaged libraries to be used as functions inside BigQuery.

Here is a more complex SQL example:

{% highlight sql %}WITH data as (    SELECT jslibs.h3.compact(      jslibs.h3.ST_H3_POLYFILLFROMGEOG(tract_geom 11)) as geo       FROM bigquery-public-data.geo_census_tracts.census_tracts_new_york       WHERE geo_id='36081000100')

SELECT jslibs.h3.ST_H3_BOUNDARY(h3) as h3geo FROM data UNNEST(geo) as h3{% endhighlight %}

Location Data Visualization

Where Can You Run These functions?

The functions are deployed and ready to use  so they can be used as if they were just another regular function: jslibs.h3.ST_H3. Again  no need to install anything  just use them on your regular SQL on BigQuery.

As most of these functions are for spatial operations  it is useful to visualize them. For visualization  you can use BigQuery Geo Viz  QGIS or our CARTOframes library! Check it out in Google Collab.

What About Information Security?

Just a quick note that although all the functions are stored in a CARTO BigQuery project  when you use them in your SQL we do not get to "see" or know that the functions are being used  so there is no need to worry from a security perspective.

Starting with Uber H3

The open source project H3 is a hexagonal hierarchical geospatial indexing system  which is really useful when working with grid structures  and we use it extensively at CARTO as part of our Data Observatory.

 {% include icons/icon-bookmark.svg %}  Data Scientist Isaac Brodsky presented Uber's H3 project at this year's Spatial Data Science Conference Watch the recording today!

For this reason  we ported most of the H3 JS library API into BigQuery  allowing you to run the majority of H3 API functions. The best way to know which functions have been ported is just to look in the Github repo.

We have ported one single function (buffer) from the open source library Turf.JS to make buffers. Lots of Data Scientists need to do buffers in BigQuery so here they are:

{% highlight sql %}SELECT jslibs.turf.ST_BUFFER(ST_GEOGPOINT(-74.00 40.7128) 1 'kilometers' 10) as geo{% endhighlight %}

We have also added some support for Quadkeys and S2 libraries  but this is just the beginning.

If you have proposals for new functions to make available or a specific library that you would like to see ported please send us a Pull Request on the project or send us a note.

Special thanks to Mamata Akella  CARTO's Head of Cartography  for her support in the creation of the maps in this post.