Enhancing Spatial Analytics in Snowflake with CARTO

Summary

As part of our commitment to enabling geospatial analysis natively within the cloud, today we announce the launch of our Spatial Analytics on Snowflake with CARTO!

This post may describe functionality for an old version of CARTO. Find out about the latest and cloud-native version here.
Enhancing Spatial Analytics in Snowflake with CARTO

Exciting news for Snowflake users! As part of our commitment to enabling geospatial analysis natively within the cloud, today we are announcing the launch of our Spatial Analytics on Snowflake with CARTO! Check out the official announcement on the Snowflake Blog.

We have been working with Snowflake for a number of months now to add more spatial capabilities to the popular data warehouse. Based on years of experience with  PostGIS development  this is a great milestone that brings the full power of Spatial SQL to the fast growing community of Snowflake users.

Most spatial analysis needs to be performed together with other 3rd party data  so we are also happy to announce the availability of our Data Observatory on the Snowflake Data Marketplace, giving users seamless access to location data to enhance their spatial analyses.

There are many CARTO users already leveraging Snowflake to power their data infrastructure  and now with full cloud native support for Snowflake they will benefit from:

  1. Increased performance. By running spatial analysis inside Snowflake  users will be able to utilize its parallelization capabilities.
  2. Convenience and interoperability. By adding spatial support directly as an extension inSQL  our users are able to integrate geospatial capabilities within their BI tools  ETLs  Low Code apps  etc.

Enhanced connectivity and geospatial capabilities

Back in September we announced the availability of our Snowflake connector letting CARTO users seamlessly ingest spatial datasets hosted in the Snowflake cloud data platform  as a first step to becoming Cloud Native. With our latest development we now provide a wealth of spatial capabilities built within Snowflake.

Spatial Analytics on Snowflake is distributed as a set of User Defined Functions (UDFs) that extends the capabilities of Snowflake for everything geo. It is available as a Snowflake data share so there is no need to install or manage anything special  they just complement your existing Snowflake data warehouse.

Screenshot of the Spatial Extension on the Snowflake Data Marketplace


This is a very innovative way of distributing new analytical capabilities, in fact, this is the first Snowflake Data Share that is used for distributing User Defined Functions. By working closely with Snowflake on this release we have been able to clearly see that this is going to be a fantastic distribution method for the future.We start by providing some foundational components  such as support for spatial indexes like H3 or S2  or functions for spatial measurements and further build on this by adding more advanced capabilities.

Ready to try out the CARTO Spatial Extension for Snowflake?

Request access on Snowflake Marketplace

We are releasing the CORE of the Spatial Analytics on Snowflake as Open Source under a BSD license. For the advanced package you will need a subscription to the CARTO platform.

Check out our documentation and a full reference to see what functions are currently available, and stay tuned as we will continue to add new ones.

Diagram showing the main components of the CARTO Spatial Extension for Snowflake


Some example queries that you can perform the Spatial Analytics

There are a number of examples already available within the documentation  so this is just a highlight of a few of them.

Computing US airports connections and routes interpolations

Compute all the paths that interconnect the main four US airports using Spatial Analytics ST_GREATCIRCLE

WITH data AS(
  SELECT 'SEA' AS abbrev  TO_GEOGRAPHY('POINT(-122.302289722924 47.4435819127259)') as geog UNION 
  SELECT 'MIA'  TO_GEOGRAPHY('POINT(-80.2789718277441 25.7949407212406)') UNION
  SELECT 'LAX'  TO_GEOGRAPHY('POINT(-118.402468548522 33.9441742543586)') UNION
  SELECT 'JFK'  TO_GEOGRAPHY('POINT(-73.7863268609295 40.6459595584081)')
)
SELECT sfcarto.transformations.ST_GREATCIRCLE(t1.geog  t2.geog  25) AS geom
FROM data AS t1
CROSS JOIN data AS t2
WHERE t1.abbrev != t2.abbrev

This query first creates all possible combinations between airports and then generates the paths between them using the ST_GREATCIRCLE function. The resulting paths contain 25 points, but you can set the number of points in order to make the lines smoother if needed (you could also exclude this parameter).

The result is displayed in this visualization. Notice that we are not using straight lines to interconnect the different airports  but great circles instead.

New supplier offices based on store locations clusters

In this example we are going to use points clustering to analyze where to locate 10 new supplier offices in the US so they can best serve all Starbucks locations.

WITH data AS(
  SELECT geog
  FROM sfcarto.public.starbucks_locations_usa
  WHERE geog IS NOT null
  ORDER BY id
) 
clustered_points AS
(
    SELECT sfcarto.clustering.ST_CLUSTERKMEANS(ARRAY_AGG(ST_ASGEOJSON(geog)::STRING)  10) AS cluster_arr
    FROM data
)
SELECT GET(VALUE  'cluster') AS cluster  TO_GEOGRAPHY(GET(VALUE  'geom')) AS geom 
FROM clustered_points  lateral FLATTEN(input => cluster_arr)

Analyzing store locations coverage using a Voronoi diagram

Voronoi diagrams are a very useful tool to calculate influence regions from a set of points and Spatial Analyticsprovides a convenient function to build them. An example application of these diagrams is the calculation of the coverage areas of a series of Starbucks stores. In the following query we are going to calculate these influence areas in Atlanta.

The result can be seen in the visualization below, where the color of each polygon indicates its area, which gives an insight on the coverage provided by each store.

##_INIT_REPLACE_ME_PRE_##
WITH starbucks AS
(
  SELECT geog
  FROM sfcarto.public.STARBUCKS_LOCATIONS_USA 
  WHERE CITY = 'Atlanta' AND geog IS NOT NULL
  ORDER BY id
) 
starbucks_array AS ( 
  SELECT ARRAY_AGG(ST_ASGEOJSON(geog)::STRING) AS geog_array
  FROM starbucks
) 
voronoi_array AS (
  SELECT sfcarto.accessors.ST_ENVELOPE(geog_array) AS envelope 
  sfcarto.processing.ST_VORONOIPOLYGONS(geog_array  ARRAY_CONSTRUCT(ST_XMIN(envelope)  ST_YMIN(envelope)  ST_XMAX(envelope)  ST_YMAX(envelope))) AS nested_voronoi
  FROM starbucks_array
)
SELECT TO_GEOGRAPHY(VALUE) AS geom  ST_AREA(geom) AS area FROM voronoi_array  lateral FLATTEN(input => nested_voronoi)

These are just a few examples of how some of the 60 or so geospatial functions currently available as part of the Spatial Analytics on Snowflake with CARTO can be used to seamlessly enhance user data with 3rd party sources and surface valuable location-based insights right inside the Snowflake cloud data platform.

If you would like a personalized demo of the Spatial Analytics on Snowflake, reach out to us on the link below. Our team is really excited to showcase its full potential!

 

   

     

map from Carto

   

   

     Want to see this in action?              Request a personalized demo