Explore Open Buildings with CARTO's Spatial Analytics in BigQuery

Summary

Google’s Open Building dataset includes 516 million buildings covering 64% of the African continent. Learn how to analyze this dataset using CARTO's Spatial Analytics in BigQuery.

This post may describe functionality for an old version of CARTO. Find out about the latest and cloud-native version here.
Explore Open Buildings with CARTO's Spatial Analytics in BigQuery

Google recently released the Open Buildings dataset, including 516 million building footprints covering 64% of the African continent. Each building is described as a polygon with a confidence score indicating the degree of confidence that the building identified is indeed a building. There are many potential use cases for the data  including our recent visualization and blog post assessing the damage from a volcano on La Palma in the Spanish Canary Islands.

To analyze the entire continent, we can leverage BigQuery to store the information in one table and then use standard SQL to ask the data questions. In addition, to explore the entire continent at once  we can use CARTO’s BigQuery Tiler (part of CARTO's Spatial Analytics in BigQuery) to visualize all 516 million buildings on one map efficiently.

Loading the data into BigQuery

Open Buildings has hosted all of the building polygons in a series of compressed CSV files in Google Cloud Storage, totaling 46 GB. Instead of downloading the files locally and pushing them back up into BigQuery, we can use the bq command-line tool to create a table in BigQuery directly from the files in cloud storage.

Each of the compressed CSV files are stored in the path: gs://open-buildings-data/v1/polygons_s2_level_4_gzip/

We can use the following command to take each of the compressed CSV files (the *.csv.gz denotes all files ending in that extension) and create a single table in BigQuery. It will automatically determine a schema to use and create a series of jobs to build a table.

bq load --autodetect YOUR_DATASET_NAME.open_buildings_raw "gs://open-buildings-data/v1/polygons_s2_level_4_gzip/*.csv.gz"

When previewing the table in BigQuery, the schema should look like this:

Screenshot of table preview in BigQuery


Screenshot of table preview in BigQuery


Transforming the data

We now have the data of the buildings in a single BigQuery table. However, there are some refinements we can make to make geospatial analysis faster and easier.

  1. Select only the columns we need (in this case, the confidence score and the geometry).
  2. Create a column containing a GEOGRAPHY type, which is required for using geospatial functions in BigQuery and CARTO's Spatial Analytics in BigQuery (the raw table geometry column is type string from the import of the CSV files).
  3. Cluster the table by the GEOGRAPHY column, which will improve certain geospatial query performance.
create table `YOUR_PROJECT_NAME.YOUR_DATASET_NAME.open_buildings_clean` cluster by geom as
select confidence  ST_GeogFromText(geometry) as geom
from `YOUR_PROJECT_NAME.YOUR_DATASET_NAME.open_buildings_raw`

Creating the tileset

We can create a tileset using CARTO's Spatial Analytics in BigQuery with a single SQL command. The first argument is the source table (the clean table we just created). The second argument is the desired output tileset table name. Last is an optional argument  which could be null  causing the default parameters to be used. However  setting zoom_min to 10 and zoom_max to 20 will reduce the size and improve performance. Since we are viewing individual buildings  we do not need to visualize the data at a zoom level lower than 10.

CALL bqcarto.tiler.CREATE_TILESET(
  '`YOUR_PROJECT_NAME.YOUR_DATASET_NAME.open_buildings_clean`' 
  '`YOUR_PROJECT_NAME.YOUR_DATASET_NAME.open_buildings_clean_tileset`' 
  STRUCT
  (
    null AS name 
    null AS description 
    null AS legend 
    10 AS zoom_min 
    20 AS zoom_max 
    null AS geom_column_name 
    null AS zoom_min_column 
    null AS zoom_max_column 
    null AS max_tile_size_kb 
    null AS tile_feature_order 
    null AS drop_duplicates 
    null AS extra_metadata
  )
);

Creating the visualization

Using CARTO Builder, we can add the tileset, apply some styling based on confidence score and switch to a Google base map to see how the polygons line up versus Google Maps road and satellite imagery layers.

Screenshot of adding source from within CARTO Builder


Click "Add source from…" and select the tileset created earlier.

Screenshot of changing base map to Google Maps from within CARTO Builder


Click the base map menu option and switch the base map to Google Maps. Optionally we can switch to Satellite if we wish to compare the building polygons to actual imagery.

Screenshot of layer settings option from within CARTO Builder


Screenshot of changing fill colour from within CARTO Builder


Go into the "Layer settings", click the three vertical dots next to Fill Color, select the confidence column in the "Color Based On" field  and switch the steps to 3 by clicking on the color ramp.

Screenshot of styled map from within CARTO Builder


At the end of the steps outlined above  we can see a map with all of the building polygons shown like the screenshot above. We can move around to most areas of Africa and see each building allowing for rapid exploration.

A live map can be found here.