Spatial Features, the new derived dataset from CARTO

Oct 23, 2020

Data is an essential ingredient for any spatial analysis, but before a dataset can be mined for insights, data scientists often need to spend a considerable amount of time gathering, cleansing, and preparing data from a host of different sources. CARTO is committed to changing this and to increasing the efficiency of Data Science teams and their workflows.

A few weeks ago we announced a new release of our Data Observatory. CARTO’s spatial data platform enables Data Scientists to broaden the scope of their analysis with the latest and greatest in location datasets.

Today we are excited to announce the release of our first derivative data product: CARTO Spatial Features, which provides global demographic data and Point of Interest (POI) aggregations by category, delivered in a common geographic support system.

Overcoming the change of support problem

The spatial data industry is booming, with new and viable data sources becoming available at an unprecedented pace. This means that data scientists are faced with the ever-present problem of how to easily blend data in multiple formats. In Spatial Data Science this is a real challenge, as datasets can be aggregated at a variety of different levels of geography.

Take, for example, a Consumer Packaged Goods (CPG) manufacturer trying to classify the areas surrounding its key retailers. Demographic data (e.g. population, age, and income), human mobility, road traffic data, and competitor locations are all likely to be expressed at different geographical scales (e.g. administrative or census regions, grids, and coordinates). With CARTO Spatial Features, we have curated multiple sources to provide data in a common and consistent geographical context.

This first version of Spatial Features, available now in the CARTO Data Observatory as a premium dataset, provides demographic variables (such as total population and population by age and gender) and Point of Interest (POI) aggregations in a curated set of standard categories (e.g. retail, education, healthcare, tourism, etc.). This initial set of derived data has been generated by processing and unifying sources from WorldPop and OpenStreetMap, solving the ever-present change of support problem (COSP). The data is available globally and is also offered on a country-by-country basis.

Population at Quad Grid level 15

Data is available at two levels of resolution

For this first derivation, our common geographic support of choice is the Quadkey grid, or Quad Grid: a division of the Earth into quadrants (using a planar projection), each uniquely identified by a string of digits whose length depends on the zoom level. In the Quad Grid system, cells at different zoom levels are fully nested in a hierarchical parent-child relationship, allowing for quick retrieval and efficient data storage. This also facilitates running spatial aggregations and visualizations on the web. Our Spatial Features datasets are available in two resolutions: Quad Grid level 15 (with cells of approximately 1km x 1km) and Quad Grid level 18 (with cells of approximately 100m x 100m).
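To make the nesting concrete, here is a minimal Python sketch of how a quadkey can be computed from a latitude and longitude, and how level 18 counts can be rolled up to level 15. It assumes the widely documented Web Mercator tile convention (the one used by the Bing Maps tile system); the function names are hypothetical and this is not necessarily how CARTO builds Spatial Features.

```python
import math
from collections import Counter

MAX_LAT = 85.05112878          # latitude limit of the Web Mercator projection
EARTH_CIRCUMFERENCE_M = 40075016.686

def latlon_to_quadkey(lat, lon, zoom):
    """Return the quadkey of the Web Mercator tile containing (lat, lon)."""
    lat = max(min(lat, MAX_LAT), -MAX_LAT)
    n = 1 << zoom  # number of tiles per axis at this zoom level
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    x = min(max(x, 0), n - 1)
    y = min(max(y, 0), n - 1)
    digits = []
    # Interleave the bits of x and y, most significant bit first.
    for i in range(zoom, 0, -1):
        mask = 1 << (i - 1)
        digit = 0
        if x & mask:
            digit += 1
        if y & mask:
            digit += 2
        digits.append(str(digit))
    return "".join(digits)

def roll_up(counts, level):
    """Aggregate per-cell counts to a coarser level by truncating each quadkey."""
    out = Counter()
    for quadkey, value in counts.items():
        out[quadkey[:level]] += value
    return out

def cell_size_m(lat, zoom):
    """Approximate east-west width of a cell, in metres, at a given latitude."""
    return EARTH_CIRCUMFERENCE_M * math.cos(math.radians(lat)) / (1 << zoom)
```

Because a parent cell's quadkey is simply a prefix of its children's quadkeys, moving from level 18 to level 15 only requires dropping the last three digits, which is what `roll_up` does. The cell width shrinks with the cosine of the latitude, so the 1km and 100m figures are approximations that vary by location.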

POI aggregations in London at Quad Grid level 15

POI aggregations in London at Quad Grid level 18

Spatial Features is an evolving product, and our Data team is already working on expanding these datasets in subsequent releases with additional features, including land use categorization and spatial lags (such as distances to hospitals, schools, main roads, etc.), as well as support for other standard grid systems such as H3 hexabins.

Access Spatial Features sample data today

To encourage innovation and gain valuable feedback from real use cases of these derived datasets, CARTO enterprise customers can now subscribe to the first version of CARTO Spatial Features datasets in the United States, United Kingdom, and Spain at no cost. These have been made available as public datasets in the Data Observatory and can be accessed free of charge.
