Spatial Features, the new derived dataset from CARTO

Data is an essential ingredient for any spatial analysis, but before a dataset can be mined for insights, data scientists often need to spend a considerable amount of time gathering, cleansing, and preparing data from a host of different sources. CARTO is committed to changing this and to increasing the efficiency of Data Science teams and their workflows.

A few weeks ago we announced a new release of our Data Observatory. CARTO’s spatial data platform enables Data Scientists to broaden the scope of their analysis with the latest and greatest in location datasets.

Today, we are excited to announce the release of our first derivative data product, CARTO Spatial Features, which provides global demographic data and Point of Interest (POI) aggregations by category, delivered on a common geographic support.

Overcoming the change of support problem

The spatial data industry is booming, with new and viable data sources becoming available at an unprecedented pace. This means that data scientists are faced with the ever-present problem of how to easily blend data in multiple formats. In Spatial Data Science this is a real challenge, as datasets can be aggregated at a variety of different levels of geography.

Take, for example, a Consumer Packaged Goods (CPG) manufacturer trying to classify the areas surrounding its key retailers. Demographic data (e.g., population, age, and income), human mobility, road traffic data, and competitor locations are all likely to be expressed at different geographical scales (e.g., administrative or census regions, grids, and point coordinates). With CARTO Spatial Features we have curated multiple sources to provide data in a common and consistent geographical context.

This first version of Spatial Features, available now in the CARTO Data Observatory as a premium dataset, provides demographic variables such as total population and population by age and gender, along with Point of Interest (POI) aggregations in a curated set of standard categories (e.g., retail, education, healthcare, and tourism). This initial set of derived data has been generated by processing and unifying sources from WorldPop and OpenStreetMap onto a single grid, which addresses the change of support problem (COSP) for these sources. The data is available globally and is also offered on a country-by-country basis.

Population at Quad Grid level 15

Data is available at two levels of resolution

For this first derivation, our common geographic support of choice is the Quadkey grid, or Quad Grid: a division of the Earth into quadrants (using a planar projection), each uniquely identified by a string of digits whose length equals the zoom level. In the Quad Grid system, cells at different zoom levels are fully nested in a hierarchical parent-child relationship, allowing for quick retrieval and efficient data storage. This also makes it easier to run spatial aggregations and visualizations on the web. Our Spatial Features datasets are available in two resolutions: Quad Grid level 15 (with cells of approximately 1 km x 1 km) and Quad Grid level 18 (with cells of approximately 100 m x 100 m).
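To make the grid concrete, here is a minimal sketch of how a coordinate can be mapped to a quadkey and how nesting follows from the key itself. It assumes the widely used Web Mercator tile scheme (the one behind Bing Maps quadkeys); the function names are illustrative and are not part of any CARTO API, and the exact grid definition used in Spatial Features may differ in its details.

```python
import math

# Assumption: standard Web Mercator tiling, WGS84 equatorial circumference.
EARTH_CIRCUMFERENCE_M = 40075016.686
MAX_LAT = 85.05112878  # latitude limit of the Web Mercator projection


def latlon_to_quadkey(lat, lon, level):
    """Return the quadkey of the cell containing (lat, lon) at `level`."""
    lat = max(min(lat, MAX_LAT), -MAX_LAT)
    n = 2 ** level  # number of cells along each axis at this level
    sin_lat = math.sin(math.radians(lat))
    x = int((lon + 180.0) / 360.0 * n)
    y = int((0.5 - math.log((1 + sin_lat) / (1 - sin_lat)) / (4 * math.pi)) * n)
    x = min(max(x, 0), n - 1)
    y = min(max(y, 0), n - 1)
    # Interleave the bits of x and y, most significant first:
    # each digit (0-3) picks one quadrant of the parent cell.
    digits = []
    for i in range(level, 0, -1):
        mask = 1 << (i - 1)
        digit = (1 if x & mask else 0) + (2 if y & mask else 0)
        digits.append(str(digit))
    return "".join(digits)


def parent_quadkey(quadkey, level):
    """The ancestor cell at a coarser `level` is just a prefix of the key."""
    return quadkey[:level]


def cell_width_m(level, lat=0.0):
    """Approximate east-west ground width of a cell at a given latitude."""
    return EARTH_CIRCUMFERENCE_M * math.cos(math.radians(lat)) / 2 ** level


if __name__ == "__main__":
    london = (51.5074, -0.1278)
    key18 = latlon_to_quadkey(*london, 18)
    key15 = latlon_to_quadkey(*london, 15)
    print(key18, key15, parent_quadkey(key18, 15) == key15)
    print(cell_width_m(15), cell_width_m(18))
```

Under these assumptions a level 15 cell is about 1.2 km wide at the equator and a level 18 cell about 150 m wide, and both shrink with the cosine of the latitude, which is consistent with the approximate sizes quoted above.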

POI aggregations in London at Quad Grid level 15
POI aggregations in London at Quad Grid level 18
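Because the grid is nested, counts stored at level 18 can be rolled up to level 15 without any geometry operations. The sketch below illustrates the idea with a plain dictionary; the quadkeys, category names, and counts are made up for illustration and do not come from the Spatial Features dataset, and `roll_up` is a hypothetical helper rather than a CARTO function.

```python
from collections import Counter, defaultdict


def roll_up(cells, target_level):
    """Aggregate {quadkey: {category: count}} to a coarser grid level.

    A cell's ancestor at `target_level` is the first `target_level`
    digits of its quadkey, so aggregation is a group-by on a prefix.
    """
    totals = defaultdict(Counter)
    for quadkey, counts in cells.items():
        if len(quadkey) < target_level:
            raise ValueError("cell %r is coarser than the target level" % quadkey)
        totals[quadkey[:target_level]].update(counts)
    return {key: dict(counter) for key, counter in totals.items()}


if __name__ == "__main__":
    # Made-up level 18 cells: the first two share a level 15 parent.
    level18 = {
        "031313131103113000": {"retail": 3, "education": 1},
        "031313131103113013": {"retail": 2},
        "031313131103120222": {"healthcare": 4},
    }
    for key, counts in roll_up(level18, 15).items():
        print(key, counts)
```

The same prefix grouping can be expressed in SQL or a dataframe library; the point is only that the parent-child relationship lives in the key.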

Spatial Features is an evolving product, and our Data team is already working on expanding these datasets in subsequent releases. Planned additions include land use categorization, spatial lags such as distances to hospitals, schools, and main roads, and support for other standard grid systems such as H3 hexagons.

Access Spatial Features sample data today

To encourage innovation and to gain valuable feedback from real use cases built on these derived datasets, CARTO enterprise customers can now subscribe to the first version of the CARTO Spatial Features datasets for the United States, United Kingdom, and Spain at no cost. These have been made available as public datasets in the Data Observatory and can be accessed free of charge.

Javier Pérez Trufero

Javier is CARTO's Head of Data, running the company strategy with respect to third-party data offerings and data science activities. Javier's responsibilities at CARTO range from establishing and coordinating new alliances with top-class data providers such as Vodafone and Mastercard, to contributing to the definition of CARTO's data products and data science projects.

